arXiv:cs/0211031v1 [cs.AI] 22 Nov 2002


Redundancy in Logic I: 
CNF Propositional Formulae 

Paolo Liberatore* 

February 1, 2008 — 19:08 


Abstract 

A knowledge base is redundant if it contains parts that can be 
inferred from the rest of it. We study the problem of checking whether 
a CNF formula (a set of clauses) is redundant, that is, it contains 
clauses that can be derived from the other ones. Any CNF formula 
can be made irredundant by deleting some of its clauses: what results 
is an irredundant equivalent subset (I.E.S.). We study the complexity
of some related problems: verification, checking existence of an I.E.S.
with a given size, checking necessary and possible presence of clauses in 
I.E.S.’s, and uniqueness. We also consider the problem of redundancy 
with different definitions of equivalence.

1 Introduction

A knowledge base is redundant if it contains parts that can be removed 
without reducing the information it carries. In this paper, we study the 
redundancy of a propositional formula in Conjunctive Normal Form (CNF), 
that is, sets of clauses. A CNF formula is redundant if and only if one or 
more clauses can be removed from it without changing its set of models. 

The problem of redundancy, and the related problem of minimization, 
are important for a number of reasons. First, removing redundant clauses 
leads to a simplification of the knowledge base. This may bring a computational
advantage in some cases (e.g., when it leads to an exponential reduction of
size). Moreover, simplifying a formula leads to a representation of the same
knowledge that is easier to understand, as a large amount of redundancy may 
obscure the meaning of the represented knowledge. The irredundant part of 
a knowledge base can instead be considered the core of the knowledge it 
represents. 

Redundancy can be a negative characteristic or not, depending on how 
the knowledge base is obtained. Intuitively, a concept that is repeated many 
times (for example, in a book) is likely to be a very important one. If a 
formula results from the translation of something expressed by human beings, 
the fact that a clause is redundant is noteworthy, as it may indicate that this 
clause carries a piece of knowledge that is considered important. 

On the other hand, redundancy may be a negative feature of a knowledge
base, as it may result from an incorrect encoding or from the merging of several
sources. In such cases, it is possible that the intended meaning of
a clause is different from what the clause formally means (for example, the
clause has been expressed using the wrong variable names). Whatever the
reason a clause is redundant, the fact that it is redundant is a hint of
something, which may be either the high importance of the knowledge it expresses
or a mistake that has been made while building the knowledge
base.

The problem of redundancy of knowledge bases may also be relevant
to applications in which the efficiency of entailment is important. Indeed, the
size of a knowledge base is one of the factors that determine the speed of the
inference process. While some theorem provers introduce a limited number of
redundant formulae to speed up solving, excessive redundancy can cause
storage problems, which in turn slow down reasoning. In particular,
updates can increase the size of knowledge bases exponentially [CDLS99,
Lib00], and redundancy makes the problem of storing the knowledge base
worse.

Algorithms for checking redundancy of knowledge bases have been developed
for the case of production rules [Gin88, SS97]. In this paper, we
instead study redundancy of propositional knowledge bases in CNF,
that is, checking whether a clause in a set is implied by the others.

A related question that has already been investigated in the propositional
case is whether a knowledge base is equivalent to a shorter one. This problem
is called minimization of propositional formulae, and it was one of the
first to be analyzed from the point of view of computational complexity: its
study began in the paper that introduced the polynomial hierarchy [MS72].
A complexity characterization of this problem was first given for Horn
knowledge bases [Mai80, ADS86, HK93]; afterwards, the problem was
tackled again in the general case [HW97, Uma98]. While the Horn case is
now quite well understood (the problem is NP-complete under several different
notions of minimality), some problems regarding non-Horn formulae are still
open. For example, the problem of deciding whether a formula is minimal (no
other formula with fewer literals is equivalent to it) is trivially in $\Pi^p_2$, but has
only been proved coNP-hard quite recently [HW97], and no tighter bound
is known. What makes this problem difficult to handle is the fact that the
considered formulae are not constrained to any particular form, such as CNF
or DNF, or even NNF.

Redundancy elimination can be considered a weak form of formula
minimization: if a set of clauses is redundant, it is not minimal, as some
clauses can be removed from it while preserving equivalence. On the other
hand, redundancy elimination only allows for the removal of clauses, so it is not
guaranteed to produce a minimal knowledge base. For example, $\{x \lor y, x \lor \neg y\}$
is irredundant, but is equivalent to a shorter set: $\{x\}$. A related problem,
not analyzed in this paper, is that of removing redundancy from a single
clause, that is, removing literals from clauses rather than removing clauses
from sets. The computational analysis of this problem, and of related ones,
has been done by Gottlob and Fermüller [GF93].

Problem                                  Complexity

Checking irredundancy                    NP-complete
A set is an I.E.S.                       $D^p$-complete
Existence of an I.E.S. of size $\le k$   $\Sigma^p_2$-complete
A clause is in all I.E.S.’s              NP-complete
A clause is in an I.E.S.                 $\Sigma^p_2$-complete
Uniqueness of I.E.S.’s                   $\Delta^p_2[\log n]$-complete

Table 1: Complexity results about redundancy

The problem of redundancy elimination is relevant for at least two reasons.
First, it seems somewhat easier to remove redundant clauses than to
reshape the whole knowledge base. Indeed, removing redundant clauses can
be done by checking whether each clause can be inferred from the other ones,
while finding a minimal equivalent formula involves a process of guessing and
checking a whole knowledge base for equivalence. Even for short knowledge
bases, the number of candidate equivalent knowledge bases is very high.

A second reason for preferring redundancy elimination to minimization 
is that the syntactic form in which a knowledge base is expressed can be 
important. For example, some semantics for knowledge base revision depend 
on the syntax of knowledge bases. If a knowledge base is replaced with an 
equivalent one, even a single update can lead to a completely different result 
[Gin86, Neb91]. 

Several problems are related to that of redundancy. The aim of checking 
redundancy is to end up with a subset of clauses that is both equivalent to 
the original one and irredundant. We call it an irredundant equivalent subset 
of the original set, or I.E.S. Note that an I.E.S. is a subset of the original 
set, and can therefore only contain clauses of the original set. This makes
it different from a minimal equivalent set, which can instead be composed of
arbitrary clauses.

The problems that are analyzed in this paper are: checking whether a set
is an I.E.S.; checking the existence of an I.E.S. of size bounded by an integer
$k$; deciding whether a clause is in some, or all, of the I.E.S.’s; and checking
uniqueness. Table 1 contains the complexity of these problems.

Since redundancy is defined in terms of equivalence (a formula is redundant
if it is equivalent to one of its proper subsets), alternative definitions of
equivalence lead to different definitions of redundancy. We have considered
two definitions of equivalence, both based on the sets of entailed formulae.
Namely, var-equivalence [CDSS97, LLM03] leads to an increase of complexity,
while conditional equivalence [LZ] does not.


2 Redundancy and I.E.S.’s 

In this paper, we study the redundancy of sets of propositional clauses. A 
knowledge base is redundant if it contains some redundant parts, that is, it 
is equivalent to one of its proper subsets. The definition therefore is affected 
by three factors: 

1. the logic we consider; 

2. what is “a part” of a knowledge base; 

3. the definition of equivalence. 

In this paper, we use propositional logic. Nevertheless, even in this simple
case, we still have the problem of defining what a part of a knowledge base is.
For example, we can consider a knowledge base as a set of formulae, where a part
is simply one formula. A restricted case is that of CNF: a knowledge base is
a set of clauses, and a part is simply a clause. We could also consider generic
Boolean formulae, where a part is any subformula.

We however only consider CNF formulae in this paper. We initially consider
the usual definition of equivalence: other definitions are considered in a
later section. We sometimes use formulae like $a_1 \land \cdots \land a_m \rightarrow b_1 \lor \cdots \lor b_k$, which
can be easily translated into the equivalent clauses $\neg a_1 \lor \cdots \lor \neg a_m \lor b_1 \lor \cdots \lor b_k$.
We also assume that clauses are not tautological.

Definition 1 A CNF formula is a set of non-tautological clauses.

Clearly, tautologies can be easily checked and removed, and do not change
the complexity of the problems considered here. The redundancy of a single 
clause is defined as follows. 

Definition 2 A clause $\gamma \in \Pi$ is redundant in $\Pi$ if and only if $\Pi \setminus \{\gamma\} \models \gamma$.

The redundancy of a clause implies that the clause can be removed from
the set without changing its meaning. In turn, the redundancy of a set of
clauses can be defined as its equivalence to one of its proper subsets.

Definition 3 A set of clauses $\Pi$ is redundant if and only if there exists
$\Pi' \subset \Pi$ such that $\Pi' \equiv \Pi$.

In propositional logic, this definition is equivalent to the following ones 
(proofs are omitted due to their triviality): 

1. there exists $\Pi' \subset \Pi$ such that $\Pi' \models \Pi$;

2. $\Pi$ contains a redundant clause.

These definitions are equivalent in classical logic, but they are not in
other logics: for example, in non-monotonic logic $\Pi' \models \Pi$ may hold, but still
$\Pi' \not\equiv \Pi$ even if $\Pi' \subset \Pi$. In the same way, it can be that no part of the
knowledge base is implied by the other ones, but still there exists a proper
equivalent subset of it [Libb].
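
The redundancy tests of Definitions 2 and 3 can be illustrated with a minimal sketch. It assumes an encoding that is not in the paper: a clause is a frozenset of nonzero integers (a positive integer for a variable, a negative one for its negation), and entailment is decided by brute force over all truth assignments, which is only practical for tiny examples. All function names are illustrative.

```python
from itertools import product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause):
    """Brute-force check of `clauses |= clause` over all truth assignments."""
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(satisfies(model, c) for c in clauses) and not satisfies(model, clause):
            return False  # a model of `clauses` that falsifies `clause`
    return True

def is_redundant_clause(clause, clauses):
    """Definition 2: the clause follows from the other clauses of the set."""
    return entails([c for c in clauses if c != clause], clause)

def is_redundant_set(clauses):
    """Definition 3, via the equivalent condition that some clause is redundant."""
    return any(is_redundant_clause(c, clauses) for c in clauses)

X, Y, Z = 1, 2, 3
pair = [frozenset({X, Y}), frozenset({X, -Y})]  # {x or y, x or not y}
triple = pair + [frozenset({X, Z})]             # adds x or z, implied by the pair
print(is_redundant_set(pair), is_redundant_set(triple))
```

The first set is the irredundant pair mentioned in the introduction; adding a clause implied by it makes the set redundant.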

A related definition is that of irredundant equivalent subset. Such sets 
result from removing some redundant clauses while preserving equivalence. 

Definition 4 A set of clauses $\Pi'$ is said to be an irredundant equivalent
subset (I.E.S.) of another set of clauses $\Pi$ if and only if:

1. $\Pi' \subseteq \Pi$

2. $\Pi' \equiv \Pi$

3. $\Pi'$ is irredundant

The second point can be replaced by $\Pi' \models \Pi$ for all monotonic logics. An
alternative definition is that an I.E.S. is an equivalent subset of the original 
set such that none of its subsets has the same properties. Any set of clauses 
has at least one I.E.S., but it may also have more than one of them, as shown 
by the following example.

Example 1 Let $\Pi = \{a \lor \neg b, \neg a \lor b, a \lor c, b \lor c\}$. This set has two I.E.S.’s:

$\Pi_1 = \Pi \setminus \{a \lor c\}$

$\Pi_2 = \Pi \setminus \{b \lor c\}$

It is indeed easy to see that the first two clauses of $\Pi$ are equivalent to
$a \equiv b$, which implies that $a \lor c$ and $b \lor c$ are equivalent. It is also easy to
see that neither $a \lor \neg b$ nor $\neg a \lor b$ can be removed from $\Pi$ while preserving
equivalence with it.
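
Example 1 can be checked mechanically. The following is a sketch under assumptions that are not part of the paper: clauses are frozensets of nonzero integers (negative means negated), entailment is tested by brute force, and all I.E.S.'s are found by enumerating every subset, which is exponential and meant only for very small sets. The names `all_ies` and `is_irredundant` are illustrative.

```python
from itertools import combinations, product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause):
    """Brute-force check of `clauses |= clause` over all truth assignments."""
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(satisfies(model, c) for c in clauses) and not satisfies(model, clause):
            return False
    return True

def is_irredundant(clauses):
    """No clause is implied by the remaining ones."""
    return not any(entails([d for d in clauses if d != c], c) for c in clauses)

def all_ies(clauses):
    """Every subset that entails the whole set (hence is equivalent) and is irredundant."""
    found = []
    for size in range(len(clauses) + 1):
        for subset in combinations(clauses, size):
            if all(entails(subset, c) for c in clauses) and is_irredundant(subset):
                found.append(set(subset))
    return found

A, B, C = 1, 2, 3
example1 = [frozenset({A, -B}), frozenset({-A, B}), frozenset({A, C}), frozenset({B, C})]
for ies in all_ies(example1):
    print(sorted(sorted(clause) for clause in ies))
```

Under this encoding the enumeration finds exactly the two sets of Example 1, each containing the two clauses that encode the equivalence of a and b.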

The set of clauses of this example can be used to show that a set of clauses
may have exponentially many I.E.S.’s. Consider the set:

$\Pi_n = \bigcup_{i=1,\ldots,n} \Pi[\{a/a_i, b/b_i, c/c_i\}]$

In words, $\Pi_n$ is made of $n$ copies of $\Pi$, each built on its own set of three
variables. While removing clauses from $\Pi_n$, we have $n$ independent choices,
one for each copy: for each $i$ we can remove either $a_i \lor c_i$ or $b_i \lor c_i$. This
proves that $2^n$ outcomes are possible, each leading to a different I.E.S.
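
The count of I.E.S.'s for the first values of n can be confirmed by exhaustive search. This sketch assumes an integer encoding of literals and a numbering of the variables of copy i as 3i+1, 3i+2, 3i+3; both are choices made here for illustration. Since every subset is enumerated, it is only run for n = 1 and n = 2.

```python
from itertools import combinations, product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause):
    """Brute-force check of `clauses |= clause` over all truth assignments."""
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(satisfies(model, c) for c in clauses) and not satisfies(model, clause):
            return False
    return True

def pi_n(n):
    """n variable-disjoint copies of the set of Example 1."""
    clauses = []
    for i in range(n):
        a, b, c = 3 * i + 1, 3 * i + 2, 3 * i + 3
        clauses += [frozenset({a, -b}), frozenset({-a, b}),
                    frozenset({a, c}), frozenset({b, c})]
    return clauses

def count_ies(clauses):
    """Number of subsets that are equivalent to the whole set and irredundant."""
    count = 0
    for size in range(len(clauses) + 1):
        for subset in combinations(clauses, size):
            equivalent = all(entails(subset, c) for c in clauses)
            irredundant = not any(
                entails([d for d in subset if d != c], c) for c in subset)
            count += equivalent and irredundant
    return count

print([count_ies(pi_n(n)) for n in (1, 2)])
```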

Since a formula may have more than one I.E.S., its clauses can be partitioned
into three sets: the ones that are in all I.E.S.’s, the ones that are
in some I.E.S.’s but not in all of them, and the ones that are in no I.E.S. The idea is that the first
clauses are necessary (they cannot be removed from the set without changing 
its semantics), the last ones are useless (their removal is harmless), while the 
other ones are “useful but not necessary”. We therefore give the following 
definitions. 

Definition 5 A clause $\gamma$ in $\Pi$ is:
necessary: it is in all I.E.S. ’s; 
useful: it is in some I.E.S. ’s; 
useless: it is not in any I.E.S. 

Note that useful clauses include all necessary ones, and that useless and
useful are opposite concepts. In terms of knowledge, necessary clauses express
knowledge in a succinct form, as they are not redundant at all. Useless
clauses can instead be considered “strongly redundant”: not only can they
be removed; they can always be removed. In a sense, they are not saying
anything useful from the point of view of the knowledge they express. On the
other hand, their presence may be important at a meta-level. For example,
the strong redundancy of a clause $\gamma$ may indicate that the information it
carries is very important. It may also indicate that the piece of knowledge
it represents has been made obsolete by successive additions, but further additions
to the knowledge base may require going back to the part of knowledge we
currently regard as useless. Either way, useless parts may in some cases
be useful at a meta-level. Finally, useful but not necessary clauses express
knowledge that the knowledge base contains in some other form, that is, these
clauses represent “one possible way” of telling this information. As for all
redundant clauses, they may tell that the knowledge they carry is regarded
as important, but they may even indicate that mistakes have been made in
the construction of the knowledge base, so that two clauses that are believed
to say something different in fact do not.

Technically, checking whether a clause is necessary is easy, as it does not
require cycling over all possible I.E.S.’s.

Lemma 1 A clause $\gamma$ is necessary in $\Pi$ if and only if $\Pi \setminus \{\gamma\} \not\models \gamma$.

Proof. If $\Pi \setminus \{\gamma\} \not\models \gamma$, then $\gamma$ belongs to all I.E.S.’s: this is an easy
consequence of the fact that no subset of $\Pi \setminus \{\gamma\}$ can imply $\gamma$. What remains to
prove is that $\Pi \setminus \{\gamma\} \models \gamma$ implies that there is an I.E.S. of $\Pi$ that does not
contain $\gamma$. We can build this I.E.S. as follows: we start from $\Pi \setminus \{\gamma\}$ and
iteratively remove clauses that can be derived from the remaining ones, until we obtain a set
from which no clause can be removed. This is clearly an I.E.S., and it does
not contain $\gamma$. □
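
Lemma 1 reduces necessity to one entailment test per clause. A minimal sketch, assuming clauses encoded as frozensets of nonzero integers and a brute-force entailment test (both assumptions of this illustration, not of the paper):

```python
from itertools import product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause):
    """Brute-force check of `clauses |= clause` over all truth assignments."""
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(satisfies(model, c) for c in clauses) and not satisfies(model, clause):
            return False
    return True

def necessary_clauses(clauses):
    """Clauses that are not implied by the rest of the set (Lemma 1)."""
    return [c for c in clauses
            if not entails([d for d in clauses if d != c], c)]

A, B, C = 1, 2, 3
example1 = [frozenset({A, -B}), frozenset({-A, B}), frozenset({A, C}), frozenset({B, C})]
print(necessary_clauses(example1))
```

On the set of Example 1, only the two clauses encoding the equivalence of a and b pass the test, in agreement with the fact that they belong to both I.E.S.'s.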

Checking uselessness of a clause cannot be expressed with a simple condition
like this one. This is shown by Theorem 5, which proves that the opposite
problem of telling whether a clause is useful is $\Sigma^p_2$-complete. As a result,
the definition of uselessness cannot be expressed as a single entailment check
(like the one for necessity) unless exponentially large formulae are used.

Any set of clauses has at least one I.E.S. Checking the existence of an 
I.E.S. is thus trivial. On the other hand, a set may have more than one I.E.S. 
Deciding uniqueness of I.E.S.’s for a specific set of clauses is important, as it 
tells whether there is a choice among the possible minimal representations of 
the same piece of information. For example, a trivial algorithm for producing
an I.E.S. is that of iteratively removing the first clause that is implied by the
other ones. This algorithm clearly outputs an I.E.S. However, other ones
may exist, and be better either because they are shorter (have fewer clauses), or
because their structure makes them more effective to use (for example, they
are Horn or in a similar special form that makes reasoning with them easier).
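
The trivial algorithm just described can be sketched as follows. The encoding (clauses as frozensets of nonzero integers) and the brute-force entailment test are assumptions made for this illustration; a practical implementation would call a SAT solver instead.

```python
from itertools import product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause):
    """Brute-force check of `clauses |= clause` over all truth assignments."""
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(satisfies(model, c) for c in clauses) and not satisfies(model, clause):
            return False
    return True

def greedy_ies(clauses):
    """Repeatedly drop the first clause implied by the others, until none is."""
    current = list(clauses)
    removed_one = True
    while removed_one:
        removed_one = False
        for clause in current:
            rest = [d for d in current if d != clause]
            if entails(rest, clause):
                current = rest          # equivalence is preserved by this removal
                removed_one = True
                break                   # restart the scan on the smaller set
    return current

A, B, C = 1, 2, 3
example1 = [frozenset({A, -B}), frozenset({-A, B}), frozenset({A, C}), frozenset({B, C})]
print(greedy_ies(example1))
print(greedy_ies(list(reversed(example1))))
```

The output depends on the order in which clauses are scanned: on the set of Example 1, the original order removes a ∨ c, while the reversed order removes b ∨ c, so the two runs return different I.E.S.'s.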

This problem is also of interest because uniqueness implies that all clauses
are either necessary or useless. As a result, checking usefulness becomes the
same problem as checking necessity, and checking uselessness becomes its
complement. Therefore, they become much simpler than in the general case.

Clearly, if a set is irredundant, it has a single I.E.S. On the other hand, 
some sets may be redundant but have a single I.E.S. anyway. The following 
example shows such a set. 

$\Pi = \{a \lor b, a \lor \neg b, a \lor c\}$

The first two clauses are in fact equivalent to $a$, which makes $a \lor c$ redundant.
On the other hand, $a \lor c$ cannot be used to infer $a$. As a result, the
only I.E.S. of this set is composed of its first two clauses.

The condition of uniqueness is formally defined as: there exists exactly
one $\Pi'$ that is a subset of $\Pi$, is equivalent to it, and is irredundant. However, the following lemma
shows an easier way to determine whether a set of clauses has a single I.E.S.

Lemma 2 A set of clauses $\Pi$ has a unique I.E.S. if and only if $\Pi_N \models \Pi$,
where $\Pi_N$ is the set of necessary clauses:

$\Pi_N = \{\gamma \in \Pi \mid \Pi \setminus \{\gamma\} \not\models \gamma\}$

Proof. If $\Pi$ has a unique I.E.S., then its clauses are exactly the clauses that
are in all I.E.S.’s of $\Pi$. Lemma 1 tells that the clauses that are contained in
all I.E.S.’s can be expressed as $\Pi_N$. The unique I.E.S. is therefore $\Pi_N$ and,
being equivalent to $\Pi$, it entails $\Pi$.

Let us now assume that $\Pi_N \models \Pi$, and prove that $\Pi_N$ is the unique I.E.S.
of $\Pi$. Since no clause of $\Pi_N$ is implied by the rest of $\Pi$, it is not implied
by the rest of $\Pi_N$ either, which proves that $\Pi_N$ is an I.E.S. We only have to
prove that $\Pi$ does not have any other I.E.S. Assume, by contradiction, that
$\Pi' \neq \Pi_N$ is an I.E.S.: if $\Pi_N \subset \Pi'$, then $\Pi'$ is not irredundant; otherwise,
there exists $\gamma \in \Pi_N \setminus \Pi'$. This condition can be decomposed into $\gamma \in \Pi_N$
and $\gamma \notin \Pi'$. The first formula implies $\Pi \setminus \{\gamma\} \not\models \gamma$ since $\Pi_N$ is the set of
necessary clauses. The second formula, together with $\Pi' \models \Pi$, implies that
$\Pi' \setminus \{\gamma\} \models \gamma$. This is a contradiction, as $\Pi' \subseteq \Pi$. □
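
The characterization of Lemma 2 needs only a linear number of entailment tests. A minimal sketch, assuming clauses encoded as frozensets of nonzero integers and brute-force entailment (illustrative choices, not the paper's):

```python
from itertools import product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause):
    """Brute-force check of `clauses |= clause` over all truth assignments."""
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(satisfies(model, c) for c in clauses) and not satisfies(model, clause):
            return False
    return True

def has_unique_ies(clauses):
    """Lemma 2: the necessary clauses alone entail every clause of the set."""
    necessary = [c for c in clauses
                 if not entails([d for d in clauses if d != c], c)]
    return all(entails(necessary, c) for c in clauses)

A, B, C = 1, 2, 3
single = [frozenset({A, B}), frozenset({A, -B}), frozenset({A, C})]
example1 = [frozenset({A, -B}), frozenset({-A, B}), frozenset({A, C}), frozenset({B, C})]
print(has_unique_ies(single), has_unique_ies(example1))
```

The first set is the redundant set with a single I.E.S. discussed before the lemma; the second is the set of Example 1, which has two.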

The following condition is sufficient for proving that a set of clauses has
more than one I.E.S. Intuitively, while removing redundant clauses from a
set, we may arrive at a point in which we have to choose which clause of a
pair to remove. If this is the case, this choice produces two
different I.E.S.’s.

Lemma 3 If $\Pi \setminus \{\gamma_1\} \equiv \Pi$ and $\Pi \setminus \{\gamma_2\} \equiv \Pi$ but $\Pi \setminus \{\gamma_1, \gamma_2\} \not\equiv \Pi$, then $\Pi$ has
at least two I.E.S.’s.

Proof. Since any set of clauses has at least one I.E.S., the same happens for
$\Pi \setminus \{\gamma_1\}$. Let therefore $\Pi'$ be an I.E.S. of $\Pi \setminus \{\gamma_1\}$. Since $\Pi \setminus \{\gamma_1\}$ is equivalent
to $\Pi$, this is also an I.E.S. of $\Pi$. It does not contain $\gamma_1$ because it is a subset of
$\Pi \setminus \{\gamma_1\}$. We show that it necessarily contains $\gamma_2$. Suppose it does not: then
$\Pi' \subseteq \Pi \setminus \{\gamma_1, \gamma_2\}$ which, by assumption, is not equivalent to $\Pi$, contrary to
the claim that it is.

We have therefore proved that any I.E.S. of $\Pi \setminus \{\gamma_1\}$ is an I.E.S. of $\Pi$ that
contains $\gamma_2$ but not $\gamma_1$. For the same reasons, any I.E.S. of $\Pi \setminus \{\gamma_2\}$ is an
I.E.S. of $\Pi$ that contains $\gamma_1$ but not $\gamma_2$. As a result, the I.E.S.’s of $\Pi \setminus \{\gamma_1\}$
and $\Pi \setminus \{\gamma_2\}$ are all different. Since each set has at least one I.E.S., we have
proved that $\Pi$ has at least two I.E.S.’s. □

This condition is however not necessary. Indeed, the choice between two
clauses may show up only when some redundant clauses have already been
removed. This happens for example when two clauses out of three have to
be removed, like in the following set:

$\Pi = \{a \equiv b, a \equiv c, a \lor d, b \lor d, c \lor d\}$

The clauses composing the first two formulae are necessary. Since $a$, $b$,
and $c$ are equivalent, the last three clauses are equivalent as well. As a
result, we can always remove two of them. This proves that the condition of
the lemma (which requires two non-necessary clauses not to be removable
at the same time) is false, while the set has more than one I.E.S.
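
This counterexample can be verified by exhaustive search. The sketch below assumes an integer encoding of literals, writes each equivalence as its two clauses, and uses brute-force entailment; the helper names `lemma3_pairs` and `count_ies` are hypothetical.

```python
from itertools import combinations, product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause):
    """Brute-force check of `clauses |= clause` over all truth assignments."""
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(satisfies(model, c) for c in clauses) and not satisfies(model, clause):
            return False
    return True

def lemma3_pairs(clauses):
    """Pairs removable one at a time but not together (the condition of Lemma 3)."""
    def still_equivalent(removed):
        rest = [c for c in clauses if c not in removed]
        return all(entails(rest, c) for c in removed)
    return [(g1, g2) for g1, g2 in combinations(clauses, 2)
            if still_equivalent([g1]) and still_equivalent([g2])
            and not still_equivalent([g1, g2])]

def count_ies(clauses):
    """Number of equivalent irredundant subsets, by enumeration of all subsets."""
    count = 0
    for size in range(len(clauses) + 1):
        for subset in combinations(clauses, size):
            equivalent = all(entails(subset, c) for c in clauses)
            irredundant = not any(
                entails([d for d in subset if d != c], c) for c in subset)
            count += equivalent and irredundant
    return count

A, B, C, D = 1, 2, 3, 4
pi = [frozenset({-A, B}), frozenset({A, -B}),   # a equivalent to b
      frozenset({-A, C}), frozenset({A, -C}),   # a equivalent to c
      frozenset({A, D}), frozenset({B, D}), frozenset({C, D})]
print(lemma3_pairs(pi), count_ies(pi))
```

No pair of clauses satisfies the condition of Lemma 3, yet the enumeration finds three I.E.S.'s, one for each of the last three clauses that is kept.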

Let us now show some properties that will be useful for the complexity 
analysis of problems related to redundancy and I.E.S.’s. Checking whether 
a specific clause is redundant is easy to characterize from a computational 
point of view, as it amounts to exactly one entailment test: $\Pi \setminus \{\gamma\} \models \gamma$. On
the other hand, results about the redundancy of a whole set are harder to
obtain, as we have to make sure that the clauses we define do not interact
to form redundancy when they should not. In other words, we can still
use the fact that proving that $\Pi \setminus \{\gamma\} \models \gamma$ is coNP-complete, but this is a
real reduction only if $\Pi$ does not contain any other redundant clauses. The
hardness proofs in this paper are indeed based on the following method:

the formula resulting from a reduction contains parts that are
known to be irredundant.

These parts will then be useful, because they express some constraints on
the model, while they do not affect redundancy. Since the only parts that 
are “known to be irredundant” are the necessary clauses, this method can 
be used for problems about I.E.S.’s as well. The following definition shows 
how clauses can be made irredundant. 

Definition 6 The irredundant version of a set of clauses $\Gamma = \{\gamma_1, \ldots, \gamma_m\}$
is defined as:

$\Gamma[C] = \{c_i \rightarrow \gamma_i \mid \gamma_i \in \Gamma\}$

where $C = \{c_1, \ldots, c_m\}$ is a set of new variables, one for each clause of $\Gamma$
($c_i \rightarrow \gamma_i$ denotes the clause $\neg c_i \lor \gamma_i$).

The point of this definition is that $\Gamma[C]$ is composed of necessary clauses
only. The following lemma shows exactly how this can be proved.

Lemma 4 For any set of clauses $\Gamma$ containing no tautologies, the model $\omega_i$
below satisfies all clauses of $\Gamma[C]$ but $c_i \rightarrow \gamma_i$:

$\omega_i(\Gamma, C) = \{c_i\} \cup \{\neg l_j \mid l_j \in \gamma_i\}$

Proof. $\omega_i(\Gamma, C)$ is not a model of $c_i \rightarrow \gamma_i$, as we assumed that no clause is
tautological. Conversely, it is a model of $\Gamma[C] \setminus \{c_i \rightarrow \gamma_i\}$ simply because
it falsifies all $c_j$’s with $j \neq i$. □
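
The construction of Definition 6 and the models of Lemma 4 can be sketched as follows. The integer encoding of literals, the numbering of the new variables (c_i is `first_c + i`, counting clauses from 0), and the convention that every variable not mentioned in the model is false are assumptions of this illustration.

```python
def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def irredundant_version(gamma, first_c):
    """Gamma[C]: clause number i becomes (not c_i) or gamma_i."""
    return [frozenset(clause | {-(first_c + i)}) for i, clause in enumerate(gamma)]

def omega(gamma, first_c, i):
    """omega_i: c_i true, every literal of gamma_i false, all other variables false."""
    last_var = first_c + len(gamma) - 1
    model = {v: False for v in range(1, last_var + 1)}
    model[first_c + i] = True
    for lit in gamma[i]:
        model[abs(lit)] = lit < 0   # a negative literal is falsified by a true variable
    return model

# An unsatisfiable set on variables 1 and 2; the c variables are 3, 4, 5.
gamma = [frozenset({1, 2}), frozenset({-1}), frozenset({-2})]
guarded = irredundant_version(gamma, first_c=3)

# For each i, list the indices of the clauses of Gamma[C] falsified by omega_i.
falsified = [
    [j for j, clause in enumerate(guarded) if not satisfies(omega(gamma, 3, i), clause)]
    for i in range(len(gamma))
]
print(falsified)
```

Each model falsifies exactly the clause it was built for, which is why no clause of the guarded set is implied by the others, even though the original set is unsatisfiable.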

This lemma actually proves that all clauses of $\Gamma[C]$ are irredundant. We
do not state the lemma this way because the reductions use $\Gamma[C]$ in conjunction
with other clauses: in order to prove that the clauses of $\Gamma[C]$ are irredundant,
we extend the models $\omega_i(\Gamma, C)$ in such a way that they satisfy all other clauses.

While it is simple to prove that $\Gamma$ is unsatisfiable if and only if $\Gamma[C] \models
\neg c_1 \lor \cdots \lor \neg c_m$, we cannot simply add this clause to $\Gamma[C]$ to show the hardness
of the irredundancy problem, as this clause may make some clauses of $\Gamma[C]$
redundant. The complete proof requires adding a new variable to that clause
to avoid this problem.

Lemma 5 For any set of clauses $\Gamma$, none of the clauses of $\Gamma[C]$ is redundant
in $\Gamma[C, a]$ below:

$\Gamma[C, a] = \Gamma[C] \cup \{\neg c_1 \lor \cdots \lor \neg c_m \lor \neg a\}$

where $a$ is a new variable, while the clause $\neg c_1 \lor \cdots \lor \neg c_m \lor \neg a$ is redundant
(i.e., $\Gamma[C] \models \neg c_1 \lor \cdots \lor \neg c_m \lor \neg a$) if and only if $\Gamma$ is unsatisfiable.

Proof. Lemma 4 proves that $\omega_i(\Gamma, C)$ is a model of all clauses of $\Gamma[C]$ but
$c_i \rightarrow \gamma_i$. Since it is also a model of the last clause ($a$ is implicitly assumed to
be false in $\omega_i(\Gamma, C)$), no clause of $\Gamma[C]$ is implied by the other ones. Let us now
prove that the redundancy of the last clause is related to the satisfiability of
$\Gamma$.

$\Gamma$ is unsatisfiable. Since $\Gamma$ has no models, no model of $\Gamma[C]$ contains all
$c_i$’s. As a result, $\Gamma[C] \models \neg c_1 \lor \cdots \lor \neg c_m$, which implies that $\Gamma[C] \models
\neg c_1 \lor \cdots \lor \neg c_m \lor \neg a$, which in turn implies that $\Gamma[C, a]$ is redundant.

$\Gamma$ is satisfiable. We prove that the last clause of $\Gamma[C, a]$ is irredundant (the
other ones have already been proved to be so). Since $\Gamma$ is satisfiable, it
has a model $\omega$. By setting all $c_i$’s and $a$ to true, we obtain the
model $\omega \cup \{c_1, \ldots, c_m, a\}$, which satisfies $\Gamma[C]$. This is not a model of
$\neg c_1 \lor \cdots \lor \neg c_m \lor \neg a$: as a result, the last clause is irredundant.

□

This lemma is used not only in the proof of hardness of the problem of 
checking redundancy, but also for the proof of hardness of other problems 
(such as checking necessity of a clause.) 
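
The statement of Lemma 5 can be checked on small inputs. This is a sketch under assumptions made here: literals are nonzero integers, the new variables c_1, ..., c_m and a are numbered right after the variables of the input set, and satisfiability and entailment are decided by brute force. The function name `gamma_c_a` is illustrative.

```python
from itertools import product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def models(variables):
    """All truth assignments over the given variables."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

def satisfiable(clauses):
    variables = sorted({abs(lit) for c in clauses for lit in c})
    return any(all(satisfies(m, c) for c in clauses) for m in models(variables))

def entails(clauses, clause):
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    return all(satisfies(m, clause) for m in models(variables)
               if all(satisfies(m, c) for c in clauses))

def is_irredundant(clauses):
    return not any(entails([d for d in clauses if d != c], c) for c in clauses)

def gamma_c_a(gamma):
    """Gamma[C, a]: guarded clauses plus (not c_1) or ... or (not c_m) or (not a)."""
    n_vars = max(abs(lit) for c in gamma for lit in c)
    cs = [n_vars + 1 + i for i in range(len(gamma))]
    a = n_vars + len(gamma) + 1
    guarded = [frozenset(clause | {-c}) for clause, c in zip(gamma, cs)]
    return guarded + [frozenset({-c for c in cs} | {-a})]

sat_set = [frozenset({1, 2}), frozenset({-1})]   # satisfiable
unsat_set = [frozenset({1}), frozenset({-1})]    # unsatisfiable
for gamma in (sat_set, unsat_set):
    print(satisfiable(gamma), is_irredundant(gamma_c_a(gamma)))
```

On both inputs the two printed values coincide: the constructed set is irredundant exactly when the input set is satisfiable.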

3 Complexity Results 

In this section, we show the complexity results that are summarized in
Table 1. The first result is about the complexity of checking whether a set of
clauses is redundant.

Theorem 1 Checking irredundancy of a set of clauses is NP-complete.

Proof. Membership: we have to check whether, for every $\gamma \in \Pi$, it holds
$\Pi \setminus \{\gamma\} \not\models \gamma$. This can be done by guessing a model for each set $\Pi \setminus \{\gamma\} \cup \{\neg\gamma\}$,
which shows the problem to be in NP.

Hardness is an easy consequence of Lemma 5: a set of non-tautological
clauses $\Gamma$ is satisfiable if and only if $\Gamma[C, a]$ is irredundant. □

What this theorem proves is that checking irredundancy of a set of clauses
is not harder, theoretically, than checking whether a single clause is
irredundant. Although the problem looks harder than entailment, the
hardness proof is indeed the more complex part of the completeness proof (it requires
using Lemma 5, which in turn requires Lemma 4). This is because redundancy
does not immediately allow expressing entailment (the irredundant
version of a set of clauses has been introduced exactly for solving this
problem).

Let us now turn to the problems related to I.E.S.’s. The first problem is
that of checking whether a set of clauses is an I.E.S. of another one. This
problem clearly requires checking equivalence and irredundancy. The following
theorem actually proves that the problem is hard for the class $D^p$, which
contains all problems that can be decomposed into a problem in NP and a
problem in coNP.

Theorem 2 Given two sets of clauses $\Pi$ and $\Pi'$, checking whether $\Pi'$ is an
I.E.S. of $\Pi$ is $D^p$-complete.

Proof. Membership amounts to showing that $\Pi' \subseteq \Pi$ (a polynomial task),
that $\Pi' \models \Pi$ (which is in coNP) and that $\Pi'$ is irredundant (which we proved
to be in NP). Therefore, the problem is in $D^p$.

Hardness is proved by reduction from the sat-unsat problem: given a pair
of sets of clauses $(\Gamma, \Sigma)$, check whether the first one is satisfiable while the
second one is not. This problem is $D^p$-complete even if $\Gamma$ and $\Sigma$ do not share
variables [BG82], which we assume. Let $C$ and $D$ be new sets of variables in
one-to-one correspondence with the clauses of $\Gamma$ and $\Sigma$, respectively. Let $a$
and $e$ be two other new variables. Reduction is as follows:

$\Pi = \Gamma[C, a] \cup \Sigma[D, e]$

$\Pi' = \Gamma[C, a] \cup \Sigma[D]$

First, we show that $\Pi'$ is irredundant if and only if $\Gamma$ is satisfiable. By
Lemma 4, $\Sigma[D]$ is irredundant. Lemma 5 proves that $\Gamma[C, a]$ is irredundant
if and only if $\Gamma$ is satisfiable. Since these two subsets of $\Pi'$ do not share
variables, $\Pi'$ is irredundant if and only if both parts are, that is, $\Pi'$ is irredundant
if and only if $\Gamma$ is satisfiable.

What remains to prove is only that $\Pi' \models \Pi$ if and only if $\Sigma$ is
unsatisfiable. The only clause of $\Pi$ that is not in $\Pi'$ is $\neg d_1 \lor \cdots \lor \neg d_r \lor \neg e$.
By Lemma 5, $\Sigma[D] \models \neg d_1 \lor \cdots \lor \neg d_r \lor \neg e$ holds if and only if $\Sigma$ is
unsatisfiable. □
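
The reduction used in this proof can be tried on small pairs of clause sets. The sketch assumes an integer encoding of literals, inputs that use disjoint variables, fresh variables numbered after all input variables, and brute-force entailment; the helper names `guard`, `reduction` and `is_ies` are hypothetical.

```python
from itertools import product

def satisfies(model, clause):
    """True if the assignment `model` (dict: variable -> bool) satisfies `clause`."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause):
    """Brute-force check of `clauses |= clause` over all truth assignments."""
    variables = sorted({abs(lit) for c in [*clauses, clause] for lit in c})
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(satisfies(model, c) for c in clauses) and not satisfies(model, clause):
            return False
    return True

def guard(clauses, first_new, with_extra):
    """Guarded version of `clauses`; with_extra adds the clause on all guards."""
    guards = list(range(first_new, first_new + len(clauses)))
    out = [frozenset(cl | {-g}) for cl, g in zip(clauses, guards)]
    next_free = first_new + len(clauses)
    if with_extra:
        out.append(frozenset({-g for g in guards} | {-next_free}))
        next_free += 1
    return out, next_free

def reduction(gamma, sigma):
    """Return (Pi, Pi') as in the proof; gamma and sigma share no variables."""
    first_new = 1 + max(abs(lit) for c in gamma + sigma for lit in c)
    gamma_ca, first_d = guard(gamma, first_new, True)
    sigma_d, _ = guard(sigma, first_d, False)
    sigma_de, _ = guard(sigma, first_d, True)   # same guards, plus the clause with e
    return gamma_ca + sigma_de, gamma_ca + sigma_d

def is_ies(subset, clauses):
    irredundant = not any(entails([d for d in subset if d != c], c) for c in subset)
    return (set(subset) <= set(clauses)
            and all(entails(subset, c) for c in clauses) and irredundant)

gamma_sat = [frozenset({1, 2}), frozenset({-1})]   # satisfiable
sigma_unsat = [frozenset({3}), frozenset({-3})]    # unsatisfiable
pi, pi_prime = reduction(gamma_sat, sigma_unsat)
print(is_ies(pi_prime, pi))
```

With a satisfiable first set and an unsatisfiable second set the check succeeds; making the second set satisfiable, or the first one unsatisfiable, makes it fail.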

Given that a set of clauses can have more than one I.E.S., it is of interest
to check the size of minimal I.E.S.’s, as it tells the amount of redundant
information the theory contains, and also how much the size of the knowledge
base can be reduced by deleting redundant clauses. The decision problem
we consider is that of checking the existence of I.E.S.’s of size bounded by an
integer $k$. This problem can be solved by iterating over all possible
I.E.S.’s. Such a procedure amounts to checking whether there exists a subset
that is an I.E.S. and has size at most $k$. The following theorem tells that this
iteration cannot be avoided in general.

Theorem 3 Given a set of clauses $\Pi$ and an integer $k$, deciding whether $\Pi$
has an I.E.S. of size at most $k$ is $\Sigma^p_2$-complete.

Proof. Membership: the problem amounts to deciding whether there exists
a subset of $\Pi$ that is equivalent to it and of size at most $k$. Since the problem
can be expressed as a $\exists\forall$QBF, it is in $\Sigma^p_2$.

Hardness is proved via a quite complicated reduction from $\exists\forall$QBF. Let
$\exists X \forall Y . \neg\Gamma$ be a formula, where $\Gamma = \{\gamma_1, \ldots, \gamma_m\}$ is a set of clauses. This
problem is $\Sigma^p_2$-hard, as it is the complement of the problem of deciding whether
a $\forall\exists$QBF, in which the matrix is a CNF formula, is valid [SM73]. We build
a set $\Pi$ as the union of the following sets of clauses:


$\Pi_1 = \bigcup_{i=1,\ldots,n} \{x_i^j, z_i^j \mid 1 \le j \le r\}$

$\Pi_2 = \bigcup_{i=1,\ldots,n} \{x_i^1 \land \cdots \land x_i^r \rightarrow x_i, \; z_i^1 \land \cdots \land z_i^r \rightarrow z_i\}$

$\Pi_3 = \bigcup_{i=1,\ldots,n} \{x_i \rightarrow w_i, \; z_i \rightarrow w_i\}$

$\Pi_4 = \bigcup_{j=1,\ldots,m} \{w_1 \land \cdots \land w_n \rightarrow \gamma'_j\}$

$\Pi_5 = \{v_1, \ldots, v_t, \neg v_1 \lor \cdots \lor \neg v_t\}$

Here, $\gamma'_j$ is obtained from $\gamma_j$ by replacing every positive occurrence of $x_i$
with $\neg z_i$. The values of the constants $k$, $r$, and $t$ are chosen as follows: if $n$ is
the number of variables and $m$ the number of clauses of $\Gamma$, we set $r = m + 1$,
$k = (r + 2) \cdot n + m$, and $t = k + 1$.

We prove that $\exists X \forall Y . \neg\Gamma$ is valid if and only if $\Pi = \Pi_1 \cup \Pi_2 \cup \Pi_3 \cup \Pi_4 \cup \Pi_5$
has an equivalent subset of size at most $k$.

The set $\Pi$ is unsatisfiable because $\Pi_5$ is unsatisfiable. Therefore, we are
looking for a subset of $\Pi$ of size at most $k$ that is unsatisfiable. Note that
removing even a single clause from $\Pi_5$ makes it satisfiable. Since $\Pi_5$ does
not share any variable with the other subsets, it follows that no proper subset
of $\Pi_5$ can contribute to the generation of unsatisfiability. Since $t > k$, if an
unsatisfiable subset of size at most $k$ contains clauses from $\Pi_5$, they can be
removed while maintaining unsatisfiability. As a result, while looking for an
unsatisfiable subset of $\Pi$, clauses of $\Pi_5$ can be disregarded: these clauses are
only used to guarantee that $\Pi$ is unsatisfiable.

We have therefore proved that $\Pi$ has an I.E.S. of size bounded by $k$ if
and only if $\Pi_1 \cup \Pi_2 \cup \Pi_3 \cup \Pi_4$ has an inconsistent subset of size bounded by
$k$. Let us therefore consider $\Pi' \subseteq \Pi_1 \cup \Pi_2 \cup \Pi_3$ and $\Pi'' \subseteq \Pi_4$, and see what
happens when $\Pi' \cup \Pi''$ is an unsatisfiable set of at most $k$ clauses.

First, neither $\Pi'$ nor $\Pi''$ is unsatisfiable alone, as both $\Pi_1 \cup \Pi_2 \cup \Pi_3$ and
$\Pi_4$ are satisfiable (the first is satisfied by the model that evaluates to true
all variables, the second by the model that evaluates to false all variables).
Second, if $\Pi'$ does not imply all $w_i$’s, then $\Pi' \cup \Pi_4$ is satisfiable, and therefore
$\Pi' \cup \Pi''$ is satisfiable as well. There exist exactly two minimal subsets of
$\Pi_1 \cup \Pi_2 \cup \Pi_3$ that imply $w_i$:


$\Sigma_i = \bigcup_{j=1,\ldots,r} \{x_i^j\} \cup \{x_i^1 \land \cdots \land x_i^r \rightarrow x_i, \; x_i \rightarrow w_i\}$

$\Sigma'_i = \bigcup_{j=1,\ldots,r} \{z_i^j\} \cup \{z_i^1 \land \cdots \land z_i^r \rightarrow z_i, \; z_i \rightarrow w_i\}$

These two sets have the same size. The number $k$ has been chosen so that
$k = n \cdot (r + 2) + m = n \cdot |\Sigma_i| + m$. Since all $w_i$’s have to be implied, $\Sigma_i \subseteq \Pi'$
or $\Sigma'_i \subseteq \Pi'$ for each $i$. Since $m < |\Sigma_i|$, we have that $k < (n + 1) \cdot |\Sigma_i|$, that is,
$\Pi'$ cannot contain more than $n$ sets $\Sigma_i$ or $\Sigma'_i$. More precisely, $r + 1 = m + 2$
other clauses are necessary to imply another $x_i$ or $z_i$, which are shared with
$\Pi''$. Therefore, $\Pi'$ must contain exactly one group among $\Sigma_i$ and $\Sigma'_i$ for any
$i$, which amounts to $n \cdot (r + 2)$ clauses. The remaining $m$ clauses can be taken
from $\Pi_4$. Since $\Pi_4$ has size $m$, we can simply take $\Pi'' = \Pi_4$.

We have proved that $\Pi'$ implies either $x_i$ or $z_i$ for any $i$, but not both.
Candidate unsatisfiable subsets are therefore in correspondence with truth
assignments on the variables $x_i$. Moreover, all variables $w_i$ are true, which
makes $\Pi_4$ equivalent to $\bigcup_{j=1,\ldots,m} \{\gamma'_j\}$. If $\Pi'$ contains $x_i$, then $\neg x_i$ can be
removed from any clause $\gamma'_j$ containing it, while $\neg z_i$ remains. The opposite
happens if $z_i$ is in $\Pi'$.

Either way, if a variable of $\{x_i, z_i\}$ is in $\Pi'$, the other one is not mentioned
in $\Pi'$, so we can assign it to false in order to satisfy as many clauses as
possible (we are trying to prove unsatisfiability, so we have to test the most
unfavorable possibility). What remains of $\Pi_4$ is the set $\Gamma$ in which all variables
$x_i$ have been removed, by assigning them either to true (if $\Sigma_i \subseteq \Pi'$) or to false
(if $\Sigma'_i \subseteq \Pi'$). Therefore, the choice of including $\Sigma_i$ or $\Sigma'_i$ makes $\Pi_4$ equivalent
to $\Gamma$ after setting $x_i$ to some truth value. Therefore, $\Pi$ has an unsatisfiable
subset of size $k$ if and only if $\exists X \forall Y . \neg\Gamma$ is true.

Note that the choice of an unsatisfiable set $\Pi$ is not necessary. Indeed,
by adding a new variable $u$ to all clauses, $\Pi$ and all its subsets are made
satisfiable. Since $\Pi$ is now equivalent to $u$, one of its subsets can be equivalent
to it only if assigning false to $u$ leads to unsatisfiability, which has been
proved to be equivalent to the QBF problem. □

This theorem implies that, unless the polynomial hierarchy collapses, the
problem of checking the existence of I.E.S.’s of size bounded by $k$ is not in any
class below $\Sigma^p_2$. As a result, the definition of the problem is not equivalent to
a condition that contains “fewer quantifiers”, unless an exponential blow-up
is introduced. In other words, any condition that does not require checking
exponentially sized formulae will contain an initial part “there exists
something...” similar to the part “there exists $\Pi'$...” of the original definition.
A simpler equivalent condition would indeed imply that the problem is
in NP or coNP.

This is not the case for the problem of checking the membership of a
clause to all I.E.S.’s, on the other hand: while the initial definition is “for
all $\Pi' \subseteq \Pi$,...”, we proved it equivalent to $\Pi \setminus \{\gamma\} \not\models \gamma$. This simplification
is however only possible because the problem is easier than what appears
from the definition: while the definition of the problem can be expressed as
a $\forall\exists$QBF (implying that the problem is in $\Pi^p_2$), Lemma 1 proved that the
problem is actually in NP. The next theorem also shows that the problem
is hard for that class (and cannot therefore be simplified to a condition that
does not require a satisfiability/entailment test at all.)

Theorem 4 Deciding whether a clause is necessary in a set (it is contained
in all its I.E.S.’s) is NP-complete.

Proof. By Lemma 1, a clause is necessary if and only if $\Pi \setminus \{\gamma\} \not\models \gamma$, and this
problem is in NP. Hardness easily follows from Lemma 5: since all clauses
of $\Gamma[C, a]$ are irredundant but (possibly) the last one, $\Gamma[C, a]$ has exactly one
I.E.S., which is either $\Gamma[C]$ or $\Gamma[C, a]$, depending on the satisfiability of $\Gamma$.
As a result, the only clause of $\Gamma[C, a] \setminus \Gamma[C]$ is in all I.E.S.’s if and only if $\Gamma$
is satisfiable. □
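
The criterion of Lemma 1 can be illustrated on very small formulae by a brute-force check. The following Python sketch is not part of the original development: the clause encoding (sets of nonzero integers, negative for negated literals), the function names, and the example formula are assumptions made only for illustration, and exhaustive enumeration of models is viable only for a handful of variables.

```python
from itertools import product

def holds(clause, model):
    """A clause (set of nonzero ints) holds if at least one of its literals is true."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause, variables):
    """Brute-force test of clauses |= clause over the given variables."""
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(holds(c, model) for c in clauses) and not holds(clause, model):
            return False  # found a model of the clauses that falsifies the clause
    return True

def is_necessary(cnf, clause):
    """Lemma 1 criterion: the clause is necessary iff the other clauses do not entail it."""
    variables = sorted({abs(lit) for c in cnf for lit in c})
    rest = [c for c in cnf if c != clause]
    return not entails(rest, clause, variables)

# Illustrative formula {x, -x v y, y} with x = 1 and y = 2 (an assumed example).
cnf = [frozenset({1}), frozenset({-1, 2}), frozenset({2})]
necessary = [c for c in cnf if is_necessary(cnf, c)]
```

In this example only the clause x is necessary: each of the other two clauses is entailed by the remaining ones, so each can be dropped in some I.E.S.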

While deciding whether a clause is in all I.E.S.’s is in NP, the similar
problem of deciding whether a clause is in at least one I.E.S. is complete for
the class $\Sigma^p_2$, and is therefore harder. This result is somewhat surprising, as
these two problems have very similar definitions, and checking the existence
of an I.E.S. containing a clause may look even simpler than checking all of
them.

Theorem 5 Deciding whether a clause $\gamma$ is in at least one I.E.S. of a set of
clauses $\Pi$ is $\Sigma^p_2$-complete.

Proof. Membership is trivial: the problem can be expressed as the existence
of a set $\Pi' \subseteq \Pi$ containing $\gamma$ that is equivalent to $\Pi$ and irredundant.

Hardness is proved by reduction from $\exists\forall$QBF. We assume that the matrix
of the QBF formula is the negation of a CNF: this problem is $\Sigma^p_2$-hard, as
it is the complement of deciding whether a $\forall\exists$QBF formula, in which the
matrix is in CNF, is valid [SM73]. We prove that $\exists X \forall Y . \neg\Gamma$ is valid (where
$\Gamma = \{\gamma_1, \ldots, \gamma_m\}$) if and only if $w$ is in at least one I.E.S. of the following set
$\Pi$:

$$\Pi = \bigcup_{i=1,\ldots,n} \{x_i, \neg x_i\} \cup \{w\} \cup \bigcup_{j=1,\ldots,m} \{w \rightarrow \gamma_j\}$$

This set is clearly unsatisfiable. Its I.E.S.’s are its minimal unsatisfiable
subsets. Let us now show how a subset $\Pi'$ of this kind is composed. If
both $x_i$ and $\neg x_i$ are in $\Pi'$, they are enough to generate contradiction, so no
other clause can be in $\Pi'$, otherwise the other clauses would be redundant.
We have therefore found a first group of minimal unsatisfiable subsets of $\Pi$:
those composed exactly of a pair $\{x_i, \neg x_i\}$.

Let us now try to build an unsatisfiable $\Pi' \subseteq \Pi$ that contains $w$. Besides
$w$, such a set $\Pi'$ can include $\bigcup_{j=1,\ldots,m} \{w \rightarrow \gamma_j\}$, as well as a literal between
$x_i$ and $\neg x_i$ for any $i$ (but not both, otherwise the other clauses would be
redundant). It is now evident that such a set can be unsatisfiable only if, for
the given choice of the $x_i$'s, the set $\Gamma$ is unsatisfiable. Thus, there exists an
unsatisfiable subset of $\Pi$ containing $w$ if and only if $\Gamma$ is unsatisfiable. What
remains to prove is that any I.E.S. obtained by removing redundant clauses
from $\Pi'$ contains $w$, but this is an easy consequence of the fact that $\Pi' \setminus \{w\}$
is satisfiable. □
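
The reduction can be checked by hand on tiny instances by enumerating all I.E.S.’s exhaustively. The Python sketch below is only an illustration under stated assumptions: the integer encoding of literals, the helper names `all_ies` and `theorem5_set`, and the two example matrices are not from the paper, and the enumeration is exponential in the number of clauses.

```python
from itertools import combinations, product

def holds(clause, model):
    """A clause (set of nonzero ints) holds if at least one of its literals is true."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def model_set(clauses, variables):
    """Set of all assignments (as tuples of booleans) satisfying every clause."""
    return frozenset(
        bits for bits in product([False, True], repeat=len(variables))
        if all(holds(c, dict(zip(variables, bits))) for c in clauses)
    )

def all_ies(cnf):
    """All irredundant equivalent subsets of cnf, found by exhaustive search."""
    variables = sorted({abs(lit) for c in cnf for lit in c})
    target = model_set(cnf, variables)
    found = []
    for size in range(len(cnf) + 1):
        for subset in combinations(cnf, size):
            if model_set(subset, variables) != target:
                continue  # not equivalent to the whole set
            # irredundant: dropping any single clause changes the set of models
            if all(model_set([c for c in subset if c != d], variables) != target
                   for d in subset):
                found.append(frozenset(subset))
    return found

def theorem5_set(n_x, w, gammas):
    """Set of clauses of the reduction; x_1..x_n are assumed to be variables 1..n_x."""
    cnf = []
    for i in range(1, n_x + 1):
        cnf += [frozenset({i}), frozenset({-i})]
    cnf.append(frozenset({w}))
    cnf += [frozenset(g) | {-w} for g in gammas]  # w -> gamma_j as a clause
    return cnf

# Assumed encoding: x = 1, y = 2, w = 3.
valid = theorem5_set(1, 3, [{1, 2}, {1, -2}])  # x = false falsifies the matrix for every y
invalid = theorem5_set(1, 3, [{1, 2}])         # every x admits a y satisfying the matrix
w_clause = frozenset({3})
```

On the first instance the clause w belongs to the I.E.S. made of the literal for x = false, w, and the two implications; on the second instance the only I.E.S. is the contradictory pair of literals on x.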

The hardness result proves that, unlike the necessary condition, the
definition of usefulness cannot be reduced to a simple entailment/satisfiability
check, unless the polynomial hierarchy collapses or some exponentially large
formulae are used.

The problem of uniqueness amounts to checking whether a set of clauses 
has a single I.E.S. This problem can be solved without cycling over all possible 
subsets of clauses, as Lemma 2 proves that finding the set of necessary clauses 
suffices. 
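
The uniqueness test suggested by Lemma 2 can be sketched as follows. This is a brute-force illustration only: the clause encoding, the function names, and the two example formulae are assumptions introduced here, and the entailment test enumerates all models.

```python
from itertools import product

def holds(clause, model):
    """A clause (set of nonzero ints) holds if at least one of its literals is true."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def entails(clauses, clause, variables):
    """Brute-force test of clauses |= clause over the given variables."""
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(holds(c, model) for c in clauses) and not holds(clause, model):
            return False
    return True

def has_unique_ies(cnf):
    """Single I.E.S. iff the set of necessary clauses is equivalent to the whole set."""
    variables = sorted({abs(lit) for c in cnf for lit in c})
    necessary = [c for c in cnf
                 if not entails([d for d in cnf if d != c], c, variables)]
    # The necessary clauses are a subset of cnf, so equivalence reduces to
    # checking that they entail every clause of cnf.
    return all(entails(necessary, c, variables) for c in cnf)

# Assumed examples with x = 1 and y = 2.
two_ies = [frozenset({1}), frozenset({-1, 2}), frozenset({2})]  # {x, -x v y} and {x, y}
one_ies = [frozenset({1}), frozenset({1, 2})]                   # only {x}
```

Each necessity test is independent of the others, which mirrors the parallel oracle calls used in the membership argument.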

Theorem 6 Deciding whether a set of clauses $\Pi$ has a single I.E.S. is
$\Delta^p_2[\log n]$-complete.

Proof. By Lemma 2, all we have to do is to check whether the set of necessary
clauses $\Pi_N$ is equivalent to $\Pi$. In turn, the set of necessary clauses can be
found by checking $\Pi \setminus \{\gamma\} \models \gamma$ for each clause $\gamma \in \Pi$. As a result, we
perform a polynomial number of parallel calls to an oracle in NP (each one
to check whether a clause is necessary) followed by a single other call (to
check equivalence between $\Pi_N$ and $\Pi$.) By a well-known result by Gottlob
[Got95], the problem is in $\Delta^p_2[\log n]$.

We prove that the problem of uniqueness is $\Delta^p_2[\log n]$-hard by reduction
from the problem of odd satisfiability: given a sequence of sets of clauses
$(\Pi^1, \ldots, \Pi^r)$, each built on its own alphabet, such that the unsatisfiability of
$\Pi^j$ implies that of $\Pi^{j+1}$, decide whether the first $\Pi^k$ that is unsatisfiable is
of odd index, that is, $k$ is odd.

For each set of clauses $\Pi^j$, we need an additional set of variables $C^j =
\{c^j_1, \ldots, c^j_m\}$ and three other variables $a^j$, $b^j$, and $e^j$. We define $\gamma^j = \neg c^j_1 \vee
\cdots \vee \neg c^j_m$. As proved by Lemma 5, $\Pi^j[C^j]$ implies $\gamma^j \vee d$, where $d$ is a variable
not occurring in $\Pi^j[C^j]$ (e.g., $a^j$), if and only if $\Pi^j$ is unsatisfiable.

Let $j$ be an odd index between 1 and $r$. Define:

$$\Pi^j_D = \Pi^j[C^j] \cup \{\gamma^j \vee a^j \vee e^j,\ \gamma^j \vee b^j \vee e^j\}$$

$$\Pi^{j+1}_D = \Pi^{j+1}[C^{j+1}] \cup \{\gamma^j \vee c^{j+1}_i \vee a^j \vee \neg b^j \mid \gamma^{j+1}_i \in \Pi^{j+1}\} \cup \{\gamma^j \vee c^{j+1}_i \vee \neg a^j \vee b^j \mid \gamma^{j+1}_i \in \Pi^{j+1}\}$$

Variables are only shared between $\Pi^j_D$ and $\Pi^{j+1}_D$, and only if $j$ is odd. We
therefore only have to check whether $\Pi^j_D \cup \Pi^{j+1}_D$ has a unique I.E.S., where
$j$ is odd.

Let us consider the easiest cases first. By Lemma 5, if $\Pi^j$ is unsatisfiable,
then the clauses $\gamma^j \vee e^j$ and $\gamma^j \vee c^{j+1}_i$ are entailed by $\Pi^j[C^j]$. As a result, all
clauses but those in W D UIL ^ 1 are redundant. Since the clauses in Lf^ UIL ^ 1 
are irredundant, we have a single I.E.S. IL^ U H^ -1 . 

The second easy case is when $\Pi^{j+1}$ is unsatisfiable. By Lemma 5, $\Pi^{j+1}[C^{j+1}]$
implies $\neg c^{j+1}_1 \vee \cdots \vee \neg c^{j+1}_m$, that is, at least one variable $c^{j+1}_i$ is false. As a
result, $(a^j \equiv b^j) \vee \gamma^j$ is entailed. Therefore, the last two clauses of $\Pi^j_D$ are
made equivalent; therefore, one of them can be removed, but not both. By
Lemma 3, $\Pi$ has more than one I.E.S.

The longest part of the proof is to prove that, if both $\Pi^j$ and $\Pi^{j+1}$ are
satisfiable, then all clauses are irredundant. This is proved by showing, for
each clause, a model of the other clauses that is not a model of it. For the
clauses in IF [(A 7 ] this is the model 07 (C- 7 ) of Lemma 4, extended by setting 
all C - 7 to false, and c 7 , a- 7 , and h 7 to true. 

For the clause $a^j \vee e^j \vee \gamma^j$, we choose the model evaluating all $c^j_i$ and $c^{j+1}_i$
to true, all variables of $\Pi^j$ and $\Pi^{j+1}$ according to their respective models,
both $a^j$ and $e^j$ to false, and $b^j$ to true. This model does not satisfy $a^j \vee
e^j \vee \gamma^j$ by construction, but satisfies all other clauses. Indeed, all clauses of
$\Pi^j[C^j] \cup \Pi^{j+1}[C^{j+1}]$ are satisfied because we have chosen the models of $\Pi^j$
and $\Pi^{j+1}$, the clause $b^j \vee e^j \vee \gamma^j$ is satisfied because of $b^j$, and the clauses
$c^{j+1}_i \vee [\neg]a^j \vee [\neg]b^j \vee \gamma^j$ are satisfied because of $c^{j+1}_i$. For the clause $b^j \vee e^j \vee \gamma^j$,
the model with the values of $a^j$ and $b^j$ swapped works in the same way.

The clause $c^{j+1}_i \vee a^j \vee \neg b^j \vee \gamma^j$ is falsified by the model that evaluates $c^{j+1}_i$ to
false, $a^j$ to false, $b^j$ to true, $e^j$ to true, all $c^j_z$ to true, all $c^{j+1}_z$ with $z \neq i$ to true,
and the variables of $\Pi^j$ and $\Pi^{j+1}$ according to their respective models. This
model satisfies all other clauses: indeed, the clauses of $\Pi^j[C^j] \cup \Pi^{j+1}[C^{j+1}]$
are satisfied by the choice of the variables of $\Pi^j$ and $\Pi^{j+1}$; all clauses with $b^j$
or $e^j$ are satisfied as well; the only remaining clauses are those of the form
$c^{j+1}_z \vee a^j \vee \neg b^j \vee \gamma^j$, with $z \neq i$, which are however satisfied by the truth value
of $c^{j+1}_z$. □

4 Query Equivalence 

The definition of equivalence that is most commonly used is that of logical 
equivalence: two formulae are equivalent if and only if they have the same 
sets of models. This definition is the same as the following one. 

two formulae $\Pi_1$ and $\Pi_2$ are logically equivalent if and only if, for
any formula $\Gamma$, it holds $\Pi_1 \models \Gamma$ if and only if $\Pi_2 \models \Gamma$.

This definition is formally equivalent to the previous one, but emphasizes
a common use of propositional formulae: if a formula $\Pi_1$ represents a piece of
knowledge, reasoning is usually (but not always) done in terms of queries. In
turn, querying a knowledge base means checking whether some facts follow
from it or not. Formally, given a piece of knowledge represented by a formula
$\Pi_1$, querying it means checking whether a fact represented by another formula
$\Gamma$ follows from it, that is, whether $\Pi_1 \models \Gamma$. If the above condition on $\Pi_1$ and
$\Pi_2$ holds, we can say that $\Pi_2$ represents the same knowledge as $\Pi_1$, as these
two formulae are indistinguishable from the point of view of reasoning.

This new definition of equivalence is of interest because it can be
extended in many directions. Namely, if not all formulae are possible queries,
it no longer coincides with logical equivalence. Two cases have been
considered in the past:

1. we are only interested in queries that are in a particular syntactic form,
for example, the Horn form [CD97];

2. we are only interested in formulae about a subset of variables [CDSS97,
LLM03].

On the other hand, we may also be interested in a set of queries that strictly
includes the set of propositional formulae. This is the case, for example, when
queries can be conditional formulae like $\Gamma > \Sigma$, which means “if $\Gamma$ were true,
would $\Sigma$ hold?”

3. we are interested in all possible conditional queries [LZ].

Intuitively, $\Gamma > \Sigma$ is entailed by $\Pi$ if and only if $\Sigma$ follows from the
formula that is obtained by revising $\Pi$ with $\Gamma$. This motivates this kind
of equivalence: two formulae are equivalent if and only if they are logically
equivalent, and remain so regardless of updates. This kind of equivalence is
related to strong equivalence in logic programming [LPV01], and has been
defined for propositional logic by Liberatore and Zhao [LZ].

We call any form of equivalence that is based on a particular set of
consequences query equivalence (this name has been used by Cadoli et al. [CDSS97]
for the definition based on a subset of variables, but it is somewhat
inappropriate as other sets of queries make sense.) The two forms of equivalence
above (based on considering subsets of propositional formulae) are called
Horn equivalence and var-equivalence, respectively. The form of equivalence
based on conditional statements is instead called strong equivalence or
conditional equivalence.

Since redundancy is defined in terms of equivalence (a set is redundant
if and only if it is equivalent to one of its proper subsets), using a definition
of equivalence that is different from the logical one leads to different
properties and results. Using query equivalence, redundancy tells which clauses
are really necessary w.r.t. a given set of queries. We only consider two kinds
of equivalence: var-equivalence and conditional equivalence. The two
corresponding forms of redundancy are called var-redundancy and conditional
redundancy.

4.1 Var-Redundancy 

Var-redundancy is defined in the same way as logical redundancy, but using
var-equivalence instead of logical equivalence. This kind of equivalence is 
called query equivalence by Cadoli et al. [CDSS97] and var-equivalence by 
Lang, Liberatore, and Marquis [LLM03]. We prefer the second name, and 
reserve the first one for the more general concept of equivalence based on an 
arbitrary set of queries. Formally, var-equivalence is defined as follows. 

Definition 7 (Var-Equivalence [LLM03]) Two formulae $\Pi_1$ and $\Pi_2$ are
var-equivalent w.r.t. a set of variables $V$ if and only if, for each formula
$\Gamma$ over variables $V$, it holds $\Pi_1 \models \Gamma$ if and only if $\Pi_2 \models \Gamma$. We denote
var-equivalence between $\Pi_1$ and $\Pi_2$ by $\Pi_1 \equiv_V \Pi_2$.

If $V$ is the set of all variables, $\equiv_V$ and $\equiv$ coincide. On the other hand, if $V$
is only composed of a subset of the variables, these two kinds of equivalence
are different. In particular, while checking equivalence is coNP-complete,
checking var-equivalence is $\Pi^p_2$-complete [LLM03]. As a result, checking
var-redundancy is expected to be different from redundancy, and to be harder.
The following equivalent condition of var-equivalence simplifies the
subsequent proofs.

Theorem 7 $\Pi_1 \equiv_V \Pi_2$ holds if and only if, for any cube $\delta$ over $V$ (i.e., a
non-tautological clause containing all variables over $V$), it holds $\Pi_1 \models \delta$ if
and only if $\Pi_2 \models \delta$.

Proof. Follows from the fact that any formula over variables $V$ can be
expressed as a conjunction of cubes over $V$. □

This theorem simply tells us that equivalence can be checked by looking
at the cubes over $V$, rather than checking all possible formulae. This theorem
also implies that all formulae that are var-equivalent are also var-equivalent
to some formula that only contains variables of $V$: one such formula is the
conjunction of all cubes over $V$ that are implied. This formula is called the
forgetting of the variables that are not in $V$ [LLM03].

Since cubes correspond to models, a similar property based on partial 
models holds. To this aim, we have however to give a special definition of 
model satisfaction. 

Definition 8 (Var-models) A model $\omega_V$ over variables $V$ is a var-model
of $\Pi$ if and only if the set of literals implied by $\omega_V$ is consistent with $\Pi$. We
denote this fact as $\omega_V \models^V \Pi$.

In other words, $\omega_V$ is a var-model of $\Pi$ if and only if there exists another
model $\omega'$ over the set of variables not in $V$ such that $\omega_V \omega' \models \Pi$. Using
this definition of models, we can give a semantical characterization of
var-equivalence.

Theorem 8 $\Pi_1 \equiv_V \Pi_2$ holds if and only if, for any model $\omega_V$ over $V$, it holds
$\omega_V \models^V \Pi_1$ if and only if $\omega_V \models^V \Pi_2$.

The definition of var-redundancy differs from that of redundancy only
because logical equivalence is replaced by var-equivalence.

Definition 9 (Var-Redundancy of a Clause) A clause $\gamma$ is var-redundant
in $\Pi$ w.r.t. variables $V$ if and only if $\Pi \setminus \{\gamma\} \equiv_V \Pi$.

The fact that var-redundancy is different from redundancy can be seen
from the following formula using $V = \{x\}$:

$$\Pi = \{x, y\}$$

$\Pi$ is logically irredundant. However, the clause $y$ is var-redundant in $\Pi$
w.r.t. $V = \{x\}$: if queries are restricted to formulae built on the variable $x$
only, then the clause $y$ is not needed. Note that var-redundancy does not
depend only on the variables a clause contains: the clause $\gamma = \neg y$ is not
var-redundant in the set $\Pi = \{x \vee y, \neg y\}$ w.r.t. $V = \{x\}$ even if it does not
mention any variable in $V$.
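
Both examples can be verified mechanically by comparing var-models, in the spirit of Theorem 8. The following Python sketch is only an illustration: the integer encoding of literals and the function names are assumptions made here, and var-models are computed by enumerating all models.

```python
from itertools import product

def holds(clause, model):
    """A clause (set of nonzero ints) holds if at least one of its literals is true."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def var_models(cnf, V, variables):
    """Assignments over V (tuples ordered as V) that extend to a model of cnf."""
    found = set()
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(holds(c, model) for c in cnf):
            found.add(tuple(model[v] for v in V))
    return found

def is_var_redundant(cnf, clause, V):
    """The clause is var-redundant iff removing it leaves the var-models unchanged."""
    variables = sorted({abs(lit) for c in cnf for lit in c} | set(V))
    rest = [c for c in cnf if c != clause]
    return var_models(rest, V, variables) == var_models(cnf, V, variables)

# Encoding assumed for the two examples of the text: x = 1, y = 2, V = {x}.
first = [frozenset({1}), frozenset({2})]       # {x, y}
second = [frozenset({1, 2}), frozenset({-2})]  # {x v y, -y}
```

In the first set removing y leaves x true as the only var-model; in the second set removing the negation of y adds the var-model in which x is false.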

Definition 10 (Var-Redundancy of a Set) A set of clauses is var-redundant 
if and only if it contains a clause that is var-redundant in it. 

Since entailment is monotonic, var-irredundancy of all clauses of $\Pi$ is the
same as the non-equivalence of $\Pi$ with any of its proper subsets. The problem
is therefore not harder than the problem of equivalence, as a linear number
of equivalence checks that can be done in parallel are as hard as a single one.

Since var-equivalence is harder than logical equivalence ($\Pi^p_2$-complete
[LLM03] vs. coNP-complete), we expect var-redundancy to be harder than
logical redundancy. However, it is also easy to prove that redundancy is in
the same class as the corresponding equivalence problem, as it amounts to
solving a number of equivalence problems that can be done in parallel.
Proving that var-redundancy is hard for the same class, instead, is slightly more
difficult. The following property is useful.

Lemma 6 A clause $\gamma$ is var-redundant in $\Pi$ w.r.t. $V$ if and only if any
var-model of $\Pi \setminus \{\gamma\}$ over $V$ is also a var-model of $\Pi$.

If a clause $\gamma$ only contains variables of $V$, checking redundancy is relatively
easy, as it amounts to checking whether $\gamma$ is logically implied by the other
clauses. As a result, the $\Pi^p_2$-hardness of the problem of redundancy of a single
clause can only be proved if the clause contains some literals not in $V$.

Theorem 9 Checking whether a clause is var-redundant w.r.t. $V$ in a set is
$\Pi^p_2$-complete.

Proof. Membership follows from the fact that checking var-equivalence is
in $\Pi^p_2$. Hardness is proved by showing that $(\neg a \vee \Sigma) \cup \{a\}$ is var-equivalent
w.r.t. $X$ to $(\neg a \vee \Sigma)$ if and only if $\forall X \exists Y . \Sigma$, where $(\neg a \vee \Sigma)$ denotes the set
$\{\neg a \vee \gamma \mid \gamma \in \Sigma\}$.

Assume $\forall X \exists Y . \Sigma$. This assumption can be rephrased as: all partial
models over $X$ can be extended to form a model of $\Sigma$. All models over $X$
can then be extended to form a model that satisfies both $(\neg a \vee \Sigma) \cup \{a\}$
and $(\neg a \vee \Sigma)$ by simply adding the evaluation of $a$ to true, that is, all
var-models of $\neg a \vee \Sigma$ are var-models of $(\neg a \vee \Sigma) \cup \{a\}$.

Assume $\exists X \forall Y . \neg\Sigma$. We prove that $(\neg a \vee \Sigma) \cup \{a\}$ and $(\neg a \vee \Sigma)$ are not
var-equivalent. Let $\omega_X$ be the model over $X$ such that $\Sigma$ is false regardless
of the value of $Y$. We show that $\omega_X$ is a var-model of $(\neg a \vee \Sigma)$, but
not of the other formula. Extending the model $\omega_X$ with the model that
sets $a$ to false and $Y$ to any value, we obtain a model of $(\neg a \vee \Sigma)$ simply
because all clauses in this set contain $\neg a$.

Let us now prove that $\omega_X$ is not a var-model of $(\neg a \vee \Sigma) \cup \{a\}$, i.e.,
it cannot be extended to form a model of $(\neg a \vee \Sigma) \cup \{a\}$. By
contradiction, let $\omega_Y$ be the partial model of $Y$ such that $\omega_X \omega_Y \omega_a$ satisfies
this formula. Since the formula contains $a$, the model $\omega_a$ must set $a$ to
true. As a result, the formula can be reduced to $\Sigma$. This implies that
there exists $\omega_Y$ that extends $\omega_X$ to form a model of $\Sigma$, contradicting
the assumption.


□ 
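
The correspondence used in this proof can be checked by brute force on small matrices. The sketch below is an illustration under stated assumptions: literals are encoded as nonzero integers, the function names are invented for this example, and the two matrices are arbitrary small instances rather than anything taken from the paper.

```python
from itertools import product

def holds(clause, model):
    """A clause (set of nonzero ints) holds if at least one of its literals is true."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def var_models(cnf, V, variables):
    """Assignments over V (tuples ordered as V) that extend to a model of cnf."""
    found = set()
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(holds(c, model) for c in cnf):
            found.add(tuple(model[v] for v in V))
    return found

def forall_exists(sigma, X, Y):
    """Brute-force evaluation of: for all X there exists Y satisfying sigma."""
    for xbits in product([False, True], repeat=len(X)):
        extendable = False
        for ybits in product([False, True], repeat=len(Y)):
            model = {**dict(zip(X, xbits)), **dict(zip(Y, ybits))}
            if all(holds(c, model) for c in sigma):
                extendable = True
                break
        if not extendable:
            return False
    return True

def unit_a_var_redundant(sigma, X, Y, a):
    """Is the unit clause a var-redundant w.r.t. X in (-a v sigma) plus {a}?"""
    guarded = [frozenset(c) | {-a} for c in sigma]
    variables = list(X) + list(Y) + [a]
    with_a = guarded + [frozenset({a})]
    return var_models(with_a, X, variables) == var_models(guarded, X, variables)

# Assumed encoding: x = 1, y = 2, a = 3.
sigma_true = [{1, 2}, {-1, -2}]   # every x has a suitable y
sigma_false = [{1, 2}, {1, -2}]   # x = false leaves no suitable y
```

On both instances the two functions agree, as the proof predicts: the unit clause is var-redundant exactly when the quantified formula holds.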

The following theorem shows the complexity of var-redundancy of a set
of clauses. This problem has the same complexity as var-redundancy of a
single clause, as in the case of logical redundancy.

Theorem 10 Checking var-redundancy of a set of clauses is $\Pi^p_2$-complete.

Proof. Membership follows from the fact that a set is var-redundant if and
only if $\Pi \setminus \{\gamma\} \equiv_V \Pi$ holds for some clause $\gamma \in \Pi$. These queries can be done
in parallel. Therefore, the problem is in the same class as the single test,
which is in $\Pi^p_2$.

Hardness is proved by showing that $\forall X \exists Y . \Sigma$ holds if and only if the
following set $\Pi$ is var-redundant w.r.t. $V = X \cup B \cup C$, where $\Sigma = \{\gamma_1, \ldots, \gamma_m\}$.
We assume, without loss of generality, that $\Sigma$ contains at least two clauses,
and it does not contain any tautological clause.


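$$\Pi = \{\pi\} \cup \bigcup_{i=1,\ldots,m} \Pi_i$$

where

$$\pi = \neg c_1 \vee \cdots \vee \neg c_m \vee a$$

$$\Pi_i = \{c_i \rightarrow \neg a \vee \gamma_i\} \cup \{\neg a \rightarrow b_i\} \cup \{l_j \rightarrow b_i \mid l_j \in \gamma_i\}$$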

Each set $\Pi_i$ entails the clause $c_i \rightarrow b_i$. This is indeed the result of resolving
all clauses of $\Pi_i$ together. As a result, $c_i \rightarrow b_i$ is a consequence of $\Pi$. Being
composed of variables of $V$ only, this clause must also be entailed by any
var-equivalent formula.

All clauses of $\Pi_i$ are irredundant in $\Pi$. This is proved by showing that,
removing one clause of $\Pi_i$ from $\Pi$, a new var-model is created. Since $\Pi$
entails $c_i \rightarrow b_i$, the following partial model cannot be extended to form
a var-model of $\Pi$, as it evaluates $c_i$ to true but $b_i$ to false.

$$\omega_{BC} = \{c_i, \neg b_i\} \cup \{\neg c_j, b_j \mid j \in \{1, \ldots, m\} \setminus \{i\}\}$$

We prove that, removing a clause $\delta \in \Pi_i$, this model can be extended to
form a model, that is, $\omega_{BC}$ can be extended to form a model of $\Pi \setminus \{\delta\}$.
Since $m \geq 2$ by assumption, $\omega_{BC}$ evaluates a variable $c_j$ to false, and
the clause $\pi$ is therefore satisfied. The model $\omega_{BC}$ also satisfies all sets
$\Pi_j$ with $j \neq i$. We therefore only have to prove that, removing a clause
from $\Pi_i$, the model $\omega_{BC}$ can be extended to form a model of the other
clauses of $\Pi_i$.

$c_i \rightarrow \neg a \vee \gamma_i$: the model $\omega$ with $\omega(a) = true$ and $\omega(l_j) = false$ for any
$l_j \in \gamma_i$ is such that $\omega_{BC}\omega \models \Pi_i \setminus \{c_i \rightarrow \neg a \vee \gamma_i\}$;

$\neg a \rightarrow b_i$: the model $\omega$ with $\omega(a) = false$ and $\omega(l_j) = false$ for any
$l_j \in \gamma_i$ is such that $\omega_{BC}\omega \models \Pi_i \setminus \{\neg a \rightarrow b_i\}$;

$l_j \rightarrow b_i$: we use the model $\omega$ with $\omega(a) = true$, $\omega(l_j) = true$, and $\omega(l_k) =
false$ for any $l_k \in \gamma_i$ with $k \neq j$. Indeed, $\omega_{BC}\omega \models \Pi_i \setminus \{l_j \rightarrow b_i\}$.

As a result, all clauses of $\Pi_i$ are irredundant in $\Pi$. In other words, $\Pi$
is redundant if and only if $\pi$ is redundant in $\Pi$.

Assume $\exists X \forall Y . \neg\Sigma$. We prove that $\pi$ is irredundant. This is proved by
showing that $\Pi \setminus \{\pi\}$ has a var-model that $\Pi$ has not. This var-model is
$\omega_X \omega_{BC}$, where $\omega_X$ is the value of $X$ that makes $\Sigma$ falsified, while $\omega_{BC}$
evaluates all variables in $B$ and $C$ to true. This model can indeed be
extended to form a model of $\Pi \setminus \{\pi\}$ by simply setting $a$ to false. On the
other hand, assume that there exists $\omega_Y \omega_a$ such that $\omega_X \omega_{BC} \omega_Y \omega_a \models \Pi$.
Since $\pi \in \Pi$, and $\omega_{BC}$ evaluates all $c_i$'s to true, we have $\omega_a(a) = true$.
As a result, all clauses $c_i \rightarrow \neg a \vee \gamma_i$ can be simplified to $\gamma_i$. We can
therefore conclude that $\omega_X \omega_Y \models \Sigma$, contrary to the assumption.

Assume that $\pi$ is irredundant in $\Pi$. We show that there exists $\omega_X$ that
falsifies $\Sigma$ regardless of the value of $\omega_Y$. By assumption, there is a
var-model of $\Pi \setminus \{\pi\}$ that is not a var-model of $\Pi$. Let $\omega_X \omega_{BC}$ be such
a var-model.

If $\omega_{BC}(c_i) = false$ for some $i$, then $\omega_{BC} \models \pi$. Since $\omega_X \omega_{BC}$ is a var-model
of $\Pi \setminus \{\pi\}$, there exists $\omega$ such that $\omega_X \omega_{BC} \omega \models \Pi \setminus \{\pi\}$. Since $\omega_{BC} \models \pi$, we
also have that $\omega_X \omega_{BC} \omega \models \Pi$, contradicting the assumption that $\omega_X \omega_{BC}$ is
not a var-model of $\Pi$. As a result, $\omega_{BC}(c_i) = true$ for all indexes $i$. Since
$c_i \rightarrow b_i$ is entailed by $\Pi_i$, and, therefore, by $\Pi \setminus \{\pi\}$, we can conclude
that $\omega_{BC}$ evaluates to true all variables of $B \cup C$.

As a result, all formulae $\neg a \rightarrow b_i$ and $l_j \rightarrow b_i$ are satisfied by $\omega_{BC}$.
Moreover, $c_i \rightarrow \neg a \vee \gamma_i$ simplifies to $\neg a \vee \gamma_i$, and $\pi$ simplifies to $a$.
Since $\omega_X \omega_{BC}$ is not a var-model of $\Pi$, then $\omega_X$ is not a var-model of
$\{\neg a \vee \gamma_i\} \cup \{a\}$, which is equivalent to $\Sigma$. In other words, $\omega_X$ cannot
be extended to form a model of $\Sigma$.


□ 


4.2 Conditional Equivalence 

Conditional equivalence (or strong equivalence) of two formulae holds
whenever the two formulae are equivalent and remain so regardless of updates.
This definition depends on how revisions of knowledge bases are done. If the
semantics of revision is syntax-independent [Dal88], then conditional and
logical equivalence coincide. On the other hand, objections to the principle of
the irrelevance of syntax have been raised [Fuh91, Neb91, BGMS99, Was01],
and some revision semantics that depend on the syntax exist. They are
mainly motivated by the fact that the syntactic form in which a formula is
expressed tells more than its set of models. In this section, we only consider
the basic definition of revision by Fagin, Ullman, and Vardi [FUV83] and by
Ginsberg [Gin86].

Definition 11 $Max(\Pi, \Gamma)$ is the set of the maximal subsets of $\Pi$ that are
consistent with $\Gamma$. Revision of $\Pi$ with $\Gamma$ is defined as follows:

$$\Pi * \Gamma = \Gamma \wedge \bigvee Max(\Pi, \Gamma)$$

We now give an equivalent characterization of this form of revision. Given
a set of clauses $\Pi$, let $S_\Pi(\omega)$ be the set of clauses of $\Pi$ that are satisfied by
the model $\omega$.

Definition 12 (Satisfied Subset) The subset of $\Pi$ satisfied by $\omega$ is:

$$S_\Pi(\omega) = \{\gamma \in \Pi \mid \omega \models \gamma\}$$

The result of revision can be characterized in terms of the set of models of
the result.

Lemma 7 The models of $\Pi * \Gamma$ are exactly the models $\omega$ of $\Gamma$ whose set $S_\Pi(\omega)$
is maximal with respect to set containment.

Proof. By definition, only models of $\Gamma$ have to be taken into account. The
set $S_\Pi(\omega)$ is the set of formulae of $\Pi$ that are satisfied by $\omega$. Since $\omega$ is
a model of $\Gamma$, we have that $S_\Pi(\omega) \cup \Gamma$ is consistent. As a result, the only
case in which $\omega$ is not a model of the revision is when this set is not one of
those maximally consistent with $\Gamma$, that is, there exists a maximal $\Pi'$ such
that $\Pi' \cup \Gamma$ is consistent, and $S_\Pi(\omega) \subset \Pi'$. Since $\Pi' \cup \Gamma$ is consistent, it has
models: let $\omega'$ be a model of $\Pi' \cup \Gamma$. Since $\omega'$ satisfies all clauses of $\Pi'$, and
$\Pi'$ is maximally consistent, we have $\Pi' = S_\Pi(\omega')$. This proves that $\omega$ is not
a model of $\Pi * \Gamma$ if and only if there exists a model $\omega'$ of $\Gamma$ with $S_\Pi(\omega) \subset S_\Pi(\omega')$. □
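
The characterization of Lemma 7 translates directly into a brute-force procedure for computing the models of a revision. The following Python sketch is only an illustration: the clause encoding, the function names, and the choice of example (a small set of four clauses revised by a single negative literal) are assumptions made here.

```python
from itertools import product

def holds(clause, model):
    """A clause (set of nonzero ints) holds if at least one of its literals is true."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def satisfied(cnf, model):
    """The subset of cnf satisfied by the model (the set S of Definition 12)."""
    return frozenset(c for c in cnf if holds(c, model))

def revise(cnf, gamma, variables):
    """Models of cnf * gamma: models of gamma whose satisfied subset is maximal."""
    candidates = [
        bits for bits in product([False, True], repeat=len(variables))
        if all(holds(c, dict(zip(variables, bits))) for c in gamma)
    ]
    sat = {bits: satisfied(cnf, dict(zip(variables, bits))) for bits in candidates}
    # "<" on frozensets is strict containment
    return {m for m in candidates if not any(sat[m] < sat[o] for o in candidates)}

# Assumed encoding: a = 1, b = 2, c = 3; models are tuples (a, b, c).
PI = [frozenset({1, 2}), frozenset({1, -2}), frozenset({1, 3}), frozenset({1, -3})]
GAMMA = [frozenset({-1})]  # revising formula: not a
full = revise(PI, GAMMA, [1, 2, 3])
without_ab = revise([c for c in PI if c != frozenset({1, 2})], GAMMA, [1, 2, 3])
```

With all four clauses, every model falsifying a is retained; once the clause a v b is dropped, only the models that also falsify b remain maximal.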

We can now formally give the definition of conditional redundancy of a
clause, and of a set of clauses.

Definition 13 (Conditional Redundancy of a Clause) A clause $\gamma$ is
conditionally redundant in $\Pi$ if and only if $\Pi$ is conditionally equivalent to
$\Pi \setminus \{\gamma\}$.

Conditional redundancy of a whole set can be defined
in two different ways: first, a set is conditionally redundant if it contains a
redundant clause; second, a set is conditionally redundant if it is equivalent
to one of its proper subsets. We use the first definition.

Definition 14 (Conditional Redundancy of a set of Clauses) A set of
clauses $\Pi$ is conditionally redundant if and only if it contains a redundant
clause.

This definition is not equivalent to the other one. For example, the set
$\Pi = \{a \vee b, a \vee \neg b, a \vee c, a \vee \neg c\}$ does not contain any conditionally redundant
clause, but is conditionally equivalent to its subset $\Pi' = \{a \vee b, a \vee \neg b\}$. This
difference is caused by the fact that revision is a non-monotonic operator:
most, but not all, non-monotonic logics show this phenomenon [Libb].
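
Both claims about this example can be verified exhaustively. The sketch below is only illustrative and rests on assumptions not made in the paper: revising formulae are represented by their sets of models over the three variables of the example, which covers every formula over these variables up to logical equivalence, and the function names and integer encoding are invented here.

```python
from itertools import combinations, product

def holds(clause, model):
    """A clause (set of nonzero ints) holds if at least one of its literals is true."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def satisfied(cnf, model):
    """The subset of cnf satisfied by the model."""
    return frozenset(c for c in cnf if holds(c, model))

def maximal(models, sat):
    """Models whose satisfied subset is not strictly contained in that of another."""
    return frozenset(m for m in models if not any(sat[m] < sat[o] for o in models))

def conditionally_equivalent(cnf1, cnf2, variables):
    """Same revision result for every consistent revising formula over the variables."""
    all_models = list(product([False, True], repeat=len(variables)))
    sat1 = {m: satisfied(cnf1, dict(zip(variables, m))) for m in all_models}
    sat2 = {m: satisfied(cnf2, dict(zip(variables, m))) for m in all_models}
    for size in range(1, len(all_models) + 1):
        for gamma_models in combinations(all_models, size):
            if maximal(gamma_models, sat1) != maximal(gamma_models, sat2):
                return False
    return True

# Assumed encoding: a = 1, b = 2, c = 3.
PI = [frozenset({1, 2}), frozenset({1, -2}), frozenset({1, 3}), frozenset({1, -3})]
PI_PRIME = [frozenset({1, 2}), frozenset({1, -2})]
VARS = [1, 2, 3]
no_clause_redundant = all(
    not conditionally_equivalent(PI, [c for c in PI if c != g], VARS) for g in PI
)
```

The check confirms that no single clause can be removed while two clauses can be removed together, which is the behaviour that separates the two candidate definitions.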

Lemma 7 relates the redundancy of a clause to the ordering on models
defined by $\subseteq$ on $S_\Pi(\cdot)$: if the removal of a clause does not modify this
ordering, the clause is redundant. On the other hand, it may
be that two models are incomparable before the removal of a clause and
equal afterwards. As a result, a difference between the orderings is in general only
a necessary condition for non-equivalence, not a sufficient one.

Lemma 8 If $\gamma$ is irredundant in $\Pi$, then there exist two models $\omega$ and $\omega'$
such that the containment relation between $S_\Pi(\omega)$ and $S_\Pi(\omega')$ is different
from that between $S_{\Pi \setminus \{\gamma\}}(\omega)$ and $S_{\Pi \setminus \{\gamma\}}(\omega')$.

Proof. Trivial consequence of Lemma 7: if the ordering is the same, then all 
revision results are the same. □ 

This condition is however not a sufficient one, in general: indeed, it may
be that the ordering is different only because two sets that are incomparable
become equal. If this is the case, the result of revision is always the same.
On the other hand, such a case is not possible if the two formulae only differ
by one clause.

Lemma 9 $\gamma$ is irredundant in $\Pi$ if and only if there exist two models $\omega$ and
$\omega'$ such that the containment relation between $S_\Pi(\omega)$ and $S_\Pi(\omega')$ is different
from that between $S_{\Pi \setminus \{\gamma\}}(\omega)$ and $S_{\Pi \setminus \{\gamma\}}(\omega')$.



Proof. The “only if” part is Lemma 8. We only have to prove that, if
the containment relation between $S_\Pi(\omega)$ and $S_\Pi(\omega')$ is different from that
between $S_{\Pi \setminus \{\gamma\}}(\omega)$ and $S_{\Pi \setminus \{\gamma\}}(\omega')$, then $\gamma$ is irredundant.

If both $\omega$ and $\omega'$ satisfy $\gamma$, then its removal does not change the
relationship between $S_\Pi(\omega)$ and $S_\Pi(\omega')$, as $\gamma$ is removed from both. On the other
hand, if neither of these models satisfies $\gamma$, then the sets $S_\Pi(\omega)$ and $S_\Pi(\omega')$ are
not modified at all.

The only remaining case is therefore that one of these two models satisfies
$\gamma$ while the other does not. Without loss of generality, assume that $\omega$ satisfies
$\gamma$ while $\omega'$ does not. As an immediate result, we have that $S_\Pi(\omega) \not\subseteq S_\Pi(\omega')$,
since the first set contains a clause the other one does not. Moreover, we
have that

$$S_{\Pi \setminus \{\gamma\}}(\omega) = S_\Pi(\omega) \setminus \{\gamma\}$$

$$S_{\Pi \setminus \{\gamma\}}(\omega') = S_\Pi(\omega')$$

In other words, the only effect of removing $\gamma$ is to remove $\gamma$ from the set
of clauses that are satisfied by $\omega$, while the clauses satisfied by $\omega'$ are the
same.

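We prove that the inverse containment is not modified by the removal of $\gamma$.
Formally, we prove that $S_\Pi(\omega') \subseteq S_\Pi(\omega)$ if and only if $S_{\Pi \setminus \{\gamma\}}(\omega') \subseteq S_{\Pi \setminus \{\gamma\}}(\omega)$.

1. If $S_\Pi(\omega') \subseteq S_\Pi(\omega)$ holds, using the equations above we have that
$S_{\Pi \setminus \{\gamma\}}(\omega') \subseteq S_{\Pi \setminus \{\gamma\}}(\omega) \cup \{\gamma\}$. Since $\gamma \notin S_{\Pi \setminus \{\gamma\}}(\omega')$, this is
equivalent to $S_{\Pi \setminus \{\gamma\}}(\omega') \subseteq S_{\Pi \setminus \{\gamma\}}(\omega)$.

2. If $S_{\Pi \setminus \{\gamma\}}(\omega') \subseteq S_{\Pi \setminus \{\gamma\}}(\omega)$, by using the equations above, we have that
$S_\Pi(\omega') \subseteq S_\Pi(\omega) \setminus \{\gamma\}$, which implies that $S_\Pi(\omega') \subseteq S_\Pi(\omega)$.

We can therefore conclude that the only possible change of relationship
between the clauses that are satisfied by $\omega$ and those satisfied by $\omega'$ is that
$S_\Pi(\omega) \not\subseteq S_\Pi(\omega')$ but $S_{\Pi \setminus \{\gamma\}}(\omega) \subseteq S_{\Pi \setminus \{\gamma\}}(\omega')$, while the set containment in
the other direction is preserved.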

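Let $\Gamma$ be the formula that has $\omega$ and $\omega'$ as its only two models. We
prove that the revision by $\Gamma$ is affected by the presence of $\gamma$. Formally,
we show that $\Pi * \Gamma$ is different from $\Pi \setminus \{\gamma\} * \Gamma$. We have already shown
that $S_\Pi(\omega) \not\subseteq S_\Pi(\omega')$ and that the inverse containment is not changed by
the removal of $\gamma$. Since the containment relation changes by assumption,
we also have that $S_{\Pi \setminus \{\gamma\}}(\omega) \subseteq S_{\Pi \setminus \{\gamma\}}(\omega')$. Two cases are possible: either
$S_\Pi(\omega') \subseteq S_\Pi(\omega)$ or not. Let us consider each case separately.

• If $S_\Pi(\omega') \subseteq S_\Pi(\omega)$, since $S_\Pi(\omega) \not\subseteq S_\Pi(\omega')$, we have $S_\Pi(\omega') \subset S_\Pi(\omega)$. As
a result, $\Pi * \Gamma$ has $\omega$ as its only model. On the other hand, $S_{\Pi \setminus \{\gamma\}}(\omega) \subseteq
S_{\Pi \setminus \{\gamma\}}(\omega')$, while the inverse containment is preserved. As a result, $\omega$ and
$\omega'$ are evaluated in the same way by $S_{\Pi \setminus \{\gamma\}}(\cdot)$. As a result,
$\Pi \setminus \{\gamma\} * \Gamma$ has both of them as models.

• If $S_\Pi(\omega') \not\subseteq S_\Pi(\omega)$, since $S_\Pi(\omega) \not\subseteq S_\Pi(\omega')$, we have that $\omega$ and $\omega'$ are
incomparable in $\Pi$. As a result, $\Pi * \Gamma$ has both of them as models.
On the other hand, we have that $S_{\Pi \setminus \{\gamma\}}(\omega) \subseteq S_{\Pi \setminus \{\gamma\}}(\omega')$, and therefore
$S_{\Pi \setminus \{\gamma\}}(\omega) \subset S_{\Pi \setminus \{\gamma\}}(\omega')$. As a result, $\omega'$ is strictly preferred over $\omega$ in
$\Pi \setminus \{\gamma\}$. As a result, $\Pi \setminus \{\gamma\} * \Gamma$ has $\omega'$ as its only model.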

We have therefore proved the following: if the removal of $\gamma$ changes the
relationship between two models $\omega$ and $\omega'$, then the only possible change is
that $S_\Pi(\omega) \not\subseteq S_\Pi(\omega')$ and $S_{\Pi \setminus \{\gamma\}}(\omega) \subseteq S_{\Pi \setminus \{\gamma\}}(\omega')$, while the inverse
containment relationship is not changed. We have then proved that such a change
leads to different results when $\Pi$ and $\Pi \setminus \{\gamma\}$ are both revised by the same
formula $\Gamma$. As a result, $\gamma$ is irredundant. □
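
The criterion of Lemma 9 only requires comparing pairs of models, which the following Python sketch does by brute force. The encoding, the function names, and the two example sets are assumptions chosen here for illustration; they do not come from the paper.

```python
from itertools import combinations, product

def holds(clause, model):
    """A clause (set of nonzero ints) holds if at least one of its literals is true."""
    return any(model[abs(lit)] == (lit > 0) for lit in clause)

def satisfied(cnf, model):
    """The subset of cnf satisfied by the model."""
    return frozenset(c for c in cnf if holds(c, model))

def relation(s, t):
    """Containment relation between two sets of clauses."""
    if s == t:
        return "equal"
    if s < t:
        return "strictly contained"
    if s > t:
        return "strictly contains"
    return "incomparable"

def conditionally_irredundant(cnf, clause, variables):
    """Lemma 9 criterion: some pair of models is related differently after removal."""
    rest = [c for c in cnf if c != clause]
    models = [dict(zip(variables, bits))
              for bits in product([False, True], repeat=len(variables))]
    for m1, m2 in combinations(models, 2):
        before = relation(satisfied(cnf, m1), satisfied(cnf, m2))
        after = relation(satisfied(rest, m1), satisfied(rest, m2))
        if before != after:
            return True
    return False

# Assumed encoding: a = 1, b = 2.
A, B, A_OR_B = frozenset({1}), frozenset({2}), frozenset({1, 2})
```

In the set containing a and a v b, the clause a v b is logically redundant but conditionally irredundant: it separates the two models that falsify a. In the set containing a, b, and a v b, its removal changes no relation between models.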

This lemma proves that the irredundancy of a clause is related to the 
modification of the set containment of the sets of clauses that are satisfied by 
the models. On the other hand, this condition is only about the redundancy 
of a single clause. If we allow removing two clauses, the ordering can be 
modified while conditional equivalence is preserved. 

Theorem 11 The following two sets of clauses are conditionally equivalent,
but the orderings they induce are different, and all clauses of $\Pi$ are
irredundant.

$$\Pi = \{a \vee b,\ a \vee \neg b,\ a \vee c,\ a \vee \neg c\}$$

$$\Pi' = \{a \vee b,\ a \vee \neg b\}$$

Proof. We prove that $\Pi$ does not contain any redundant clause. Its symmetry
allows proving it for a single clause only. Let us therefore show that $a \vee b$ is
irredundant. Consider the following revising formula $T = \neg a$. The maximal
subsets of $\Pi$ that are consistent with $T$ are composed of exactly one clause
between $a \vee b$ and $a \vee \neg b$, and one clause between $a \vee c$ and $a \vee \neg c$. As a
result, $\Pi * T \equiv \neg a$.

Let us now consider $\Pi \setminus \{a \vee b\} * T$. The maximal subsets of $\Pi \setminus \{a \vee b\}$ that
are consistent with $\neg a$ contain the clause $a \vee \neg b$, and one clause between $a \vee c$
and $a \vee \neg c$. As a result, all maximal subsets contain $a \vee \neg b$, which is therefore
in $\Pi \setminus \{a \vee b\} * T$. We can therefore conclude that $\Pi \setminus \{a \vee b\} * T \equiv \neg a \wedge \neg b$.
Since this is different from $\Pi * T$, the clause $a \vee b$ is irredundant in $\Pi$.

We now prove that $\Pi'$ is conditionally equivalent to $\Pi$. Let $\omega$ and $\omega'$ be
two models. If they both satisfy $c$ or they both satisfy $\neg c$, the sets of satisfied
clauses are modified in the same way by the removal of $a \vee c$ and $a \vee \neg c$.
On the other hand, if one of them satisfies $c$ and the other one satisfies $\neg c$,
then either they are compared in the same way in $\Pi$ and $\Pi'$, or they are
incomparable in $\Pi$ but equal in $\Pi'$. The only difference is therefore that some
pairs of models are incomparable in $\Pi$ but equal in $\Pi'$. As a result, the
maximal ones are always the same.

While $\Pi$ and $\Pi'$ are conditionally equivalent, there exist two models that
are compared differently in $\Pi$ and $\Pi'$. Let $\omega$ and $\omega'$ be the models such that
$\omega \models \neg a \wedge \neg b \wedge \neg c$ and $\omega' \models \neg a \wedge \neg b \wedge c$. The sets $S_\Pi(\omega)$ and $S_\Pi(\omega')$ are not
comparable: the first contains $a \vee \neg c$ but not $a \vee c$, while the second contains
the second one but not the first. On the other hand,
$S_{\Pi'}(\omega) = S_{\Pi'}(\omega') = \{a \vee \neg b\}$. As a result, the ordering is changed. □
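The claims of Theorem 11 can be checked mechanically, since only three variables are involved. The following is a minimal sketch and not part of the original development. It assumes that the models of $\Pi * T$ are the models of $T$ whose set of satisfied clauses is maximal with respect to set containment, which is how the ordering is used in the proofs above. The names `satisfied`, `revise` and `conditionally_equivalent` are illustrative, and each satisfiable revising formula is represented by its nonempty set of models.

```python
from itertools import combinations, product

VARS = ("a", "b", "c")

# A clause is a frozenset of (variable, sign) literals; sign True means positive.
A_OR_B = frozenset({("a", True), ("b", True)})
A_OR_NOT_B = frozenset({("a", True), ("b", False)})
A_OR_C = frozenset({("a", True), ("c", True)})
A_OR_NOT_C = frozenset({("a", True), ("c", False)})

PI = [A_OR_B, A_OR_NOT_B, A_OR_C, A_OR_NOT_C]
PI_PRIME = [A_OR_B, A_OR_NOT_B]

# All eight models over a, b, c; a model is a dict variable -> bool.
MODELS = [dict(zip(VARS, bits)) for bits in product((False, True), repeat=len(VARS))]


def satisfied(clauses, model):
    """S(model): the set of clauses in `clauses` that `model` satisfies."""
    return frozenset(c for c in clauses if any(model[v] == s for v, s in c))


def revise(clauses, t_models):
    """Assumed semantics of revision: keep the models of T (given as indices
    into MODELS) whose satisfied-clause set is maximal under set inclusion."""
    sets = {i: satisfied(clauses, MODELS[i]) for i in t_models}
    return frozenset(i for i in t_models
                     if not any(sets[i] < sets[j] for j in t_models))


def all_revising_formulas():
    """Every satisfiable T, represented by its nonempty set of model indices."""
    for k in range(1, len(MODELS) + 1):
        yield from combinations(range(len(MODELS)), k)


def conditionally_equivalent(p1, p2):
    """True if p1 and p2 give the same revision result for every T."""
    return all(revise(p1, t) == revise(p2, t) for t in all_revising_formulas())


if __name__ == "__main__":
    print(conditionally_equivalent(PI, PI_PRIME))
    for gamma in PI:
        rest = [c for c in PI if c != gamma]
        print(conditionally_equivalent(PI, rest))
```

Under the stated assumption, the first check compares $\Pi$ with $\Pi'$, and the loop compares $\Pi$ with each subset obtained by removing a single clause, which is the notion of irredundancy used in the theorem.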

The following lemma makes the statement of Lemma 9 more precise: not
only is there a pair of models whose ordering is modified, but this ordering is
modified in a very specific way.

Lemma 10 $\gamma$ is conditionally irredundant in $\Pi$ if and only if there exist
two models $\omega$ and $\omega'$ such that:

$$S_\Pi(\omega) \setminus S_\Pi(\omega') = \{\gamma\}$$

Proof. If two such models exist, then we have that $S_\Pi(\omega) \not\subseteq S_\Pi(\omega')$, since
$S_\Pi(\omega)$ contains a clause that is not in $S_\Pi(\omega')$. On the other hand, since $\gamma$ is
the only clause that is in $S_\Pi(\omega)$ but not in $S_\Pi(\omega')$, removing it from both
sets leads to $S_{\Pi \setminus \{\gamma\}}(\omega) \subseteq S_{\Pi \setminus \{\gamma\}}(\omega')$. This result tells us that the removal of
$\gamma$ modifies the relationship between the sets of clauses that are satisfied by $\omega$
and by $\omega'$. By Lemma 9, this implies that $\gamma$ is irredundant.

Let us assume that $\gamma$ is irredundant. By Lemma 8, there are two models
$\omega$ and $\omega'$ such that the containment relation between $S_\Pi(\omega)$ and $S_\Pi(\omega')$ is
affected by the presence of $\gamma$ in $\Pi$. If $\omega$ and $\omega'$ evaluate $\gamma$ in the same way
(i.e., either both or none of them satisfy it), then removing $\gamma$ modifies their
sets $S_\Pi(\cdot)$ in the same way (either $\gamma$ is removed from both, or it is not in
either already), so the containment relation between them would not change.

As a result, either $\omega$ or $\omega'$ satisfies $\gamma$, but not both. Without loss of
generality, we can assume that $\omega$ is the model that satisfies $\gamma$. As a result,
we have that $\gamma \in S_\Pi(\omega) \setminus S_\Pi(\omega')$. We therefore only have to prove that no
other clause is in this difference. To the contrary, assume that $S_\Pi(\omega) \setminus S_\Pi(\omega')$
contains another clause $\gamma'$, that is, $\{\gamma, \gamma'\} \subseteq S_\Pi(\omega) \setminus S_\Pi(\omega')$. The only effect of
removing $\gamma$ is that $\gamma$ disappears from the set of clauses satisfied by $\omega$; on the
other hand, $\gamma'$ is still there. As a result, the relationship between the sets of
satisfied clauses remains the same. Formally, two cases are possible: either
$\omega$ satisfies all clauses that are satisfied by $\omega'$, or $\omega'$ satisfies some clauses
more. In the first case, the removal of $\gamma$ does not change the relationship
because $\omega$ still satisfies all clauses satisfied by $\omega'$, and also $\gamma'$. In the
second case, the sets of satisfied clauses are still incomparable, as $\omega'$
satisfies the same clauses, while $\omega$ satisfies $\gamma'$. □
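The condition of Lemma 10 can be tested by exhaustive search over pairs of models when the number of variables is small. The sketch below is only an illustration under that assumption: the function name `lemma10_witness` and the encoding of clauses as sets of (variable, sign) pairs are choices made here, and the search is exponential in the number of variables.

```python
from itertools import product


def satisfied(clauses, model):
    """S(model): the clauses satisfied by `model`, a dict variable -> bool."""
    return frozenset(c for c in clauses if any(model[v] == s for v, s in c))


def lemma10_witness(clauses, gamma, variables):
    """Return models (w, w2) with S(w) minus S(w2) equal to {gamma}, else None."""
    models = [dict(zip(variables, bits))
              for bits in product((False, True), repeat=len(variables))]
    for w in models:
        s_w = satisfied(clauses, w)
        if gamma not in s_w:
            continue  # w must satisfy gamma for gamma to be in the difference
        for w2 in models:
            if s_w - satisfied(clauses, w2) == {gamma}:
                return w, w2
    return None


# Pi from Theorem 11; a literal is a (variable, sign) pair.
PI = [
    frozenset({("a", True), ("b", True)}),
    frozenset({("a", True), ("b", False)}),
    frozenset({("a", True), ("c", True)}),
    frozenset({("a", True), ("c", False)}),
]

if __name__ == "__main__":
    # A returned pair certifies, by Lemma 10, that a v b is irredundant in Pi.
    print(lemma10_witness(PI, PI[0], ["a", "b", "c"]))
```

A returned pair plays the role of $\omega$ and $\omega'$ in the lemma; `None` means that no such pair exists, that is, the clause is conditionally redundant.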

We now have all the technical tools to prove the complexity of checking
redundancy of a single clause in a set.

Theorem 12 Checking whether $\gamma$ is conditionally redundant in $\Pi$ is
coNP-complete.

Proof. Membership: a clause is irredundant if and only if there exist two
models such that their ordering is affected by the presence of the clause.
Irredundancy is therefore in NP, and redundancy is in coNP.

Hardness: the set of clauses $\Pi$ is satisfiable if and only if the clause $a$ is
conditionally irredundant in $\Sigma = (a \vee \Pi) \cup \{a\}$, where $a \vee \Pi$ is a shorthand for
$\{a \vee \gamma \mid \gamma \in \Pi\}$. We divide the proof in two parts: first, we consider the case
in which $\Pi$ is satisfiable, and prove that $a$ is irredundant; second, we show
that the irredundancy of $a$ implies the satisfiability of $\Pi$.

If $\Pi$ is satisfiable, it has a model $\omega_\Pi$. We show that $a$ is irredundant in
$\Sigma$ by considering two models: the first one is $\omega$, which is obtained by adding
the evaluation $a = \text{false}$ to $\omega_\Pi$; the second one is $\omega_t$, the model that sets all
variables to true. The first model satisfies all clauses but $a$; the second model
satisfies all clauses. As a result, we have that $a$ is the only clause satisfied by
$\omega_t$ that is not satisfied by $\omega$, that is: $S_\Sigma(\omega_t) \setminus S_\Sigma(\omega) = \{a\}$. By Lemma 10,
this implies that $a$ is irredundant.

Let us now assume that $a$ is irredundant. By Lemma 10, $S_\Sigma(\omega) \setminus S_\Sigma(\omega')$
is equal to $\{a\}$ for some pair of models $\omega$ and $\omega'$. This condition implies that
$\omega$ satisfies $a$ while $\omega'$ does not. Since $\omega$ satisfies $a$ we have that $S_\Sigma(\omega) = \Sigma$,
since all clauses of $\Sigma$ contain $a$. As a result, $S_\Sigma(\omega') = \Sigma \setminus \{a\}$, that is, $\omega'$
satisfies all clauses of $a \vee \Pi$. Since $\omega'(a) = \text{false}$, the model $\omega'$ satisfies all
clauses of $\Pi$. □
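The reduction used in the hardness part can be tried out on small instances. The following sketch builds $\Sigma = (a \vee \Pi) \cup \{a\}$ and tests the irredundancy of $a$ with the condition of Lemma 10, by brute force. It assumes that the variable named `a` does not occur in $\Pi$ and that $\Pi$ contains no empty clause; the function names are illustrative only.

```python
from itertools import product


def all_models(variables):
    """Yield every assignment of the variables as a dict variable -> bool."""
    for bits in product((False, True), repeat=len(variables)):
        yield dict(zip(variables, bits))


def satisfied(clauses, model):
    """S(model): the clauses satisfied by `model`."""
    return frozenset(c for c in clauses if any(model[v] == s for v, s in c))


def reduction(pi, fresh="a"):
    """Build Sigma = (a v Pi) u {a}; `fresh` is assumed not to occur in Pi."""
    unit = frozenset({(fresh, True)})
    return [c | unit for c in pi] + [unit], unit


def irredundant(clauses, gamma, variables):
    """Lemma 10 test: some pair of models has S(w) minus S(w2) equal to {gamma}."""
    sets = [satisfied(clauses, m) for m in all_models(variables)]
    return any(s1 - s2 == {gamma} for s1 in sets if gamma in s1 for s2 in sets)


def satisfiable(pi, variables):
    """Brute-force satisfiability of the clause set `pi`."""
    return any(satisfied(pi, m) == frozenset(pi) for m in all_models(variables))


if __name__ == "__main__":
    x, not_x = frozenset({("x", True)}), frozenset({("x", False)})
    for pi in ([x], [x, not_x]):
        sigma, a = reduction(pi)
        # The two printed values are expected to agree for every Pi.
        print(satisfiable(pi, ["x"]), irredundant(sigma, a, ["x", "a"]))
```

Both tests are exponential here; the point of the sketch is only to make the correspondence between satisfiability of $\Pi$ and irredundancy of $a$ in $\Sigma$ concrete.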

We now prove that checking whether a formula contains a redundant 
clause is coNP-complete as well. Note that the redundancy of a formula 
is defined as the presence of a redundant clause in the set, and not as the 
property of being equivalent to a proper subset. These two definitions are 
not equivalent, as shown by Theorem 11.

Theorem 13 Checking redundancy (i.e., presence of a redundant clause) of
a set of clauses is coNP-complete.

Proof. Membership is proved as usual: we have to check the redundancy 
of some clause; these tests can be done in parallel, and therefore the whole 
problem is in coNP. 

Hardness is proved as follows: we prove that the clause $a$ is redundant in
the following set $\Sigma$ if and only if $\Pi$ is unsatisfiable:

$$\Sigma = \{\neg c_i \vee a \vee \gamma_i \mid \gamma_i \in \Pi\} \cup \{c_i \vee a \mid \gamma_i \in \Pi\} \cup \{a\}$$

We indeed prove the following: first, all clauses but $a$ are irredundant.
Second, that $a$ is redundant if and only if $\Pi$ is unsatisfiable.

All clauses of $\Sigma \setminus \{a\}$ are irredundant. This is proved by showing, for
each clause $\delta$ of $\Sigma$ that is not $a$, a possible revising formula $T$ such
that $\Sigma * T$ is different from $\Sigma \setminus \{\delta\} * T$.

$\neg c_i \vee a \vee \gamma_i$. The formula is $T = \neg a \wedge c_i \wedge \bigwedge \{\neg c_j \mid j \neq i\}$. This formula
satisfies $c_i \vee a$ and all clauses $\neg c_j \vee a \vee \gamma_j$ with $j \neq i$, and falsifies $a$ and all
clauses $c_j \vee a$ with $j \neq i$. As a result, the only clause that is neither
satisfied nor contradicted is $\neg c_i \vee a \vee \gamma_i$. As a result, the result of the
revision entails $\gamma_i$ if and only if this clause is present.

$c_i \vee a$. We use the formula $T = \neg a \wedge \bigwedge \{\neg c_j \mid j \neq i\} \wedge \gamma_i$. This formula
falsifies $a$ and all clauses $c_j \vee a$ with $j \neq i$, and implies the clause
$\neg c_i \vee a \vee \gamma_i$ because of $\gamma_i$, and all clauses $\neg c_j \vee a \vee \gamma_j$ for any $j \neq i$
because $T \models \neg c_j$. As a result, the only clause that is neither falsified
nor entailed is $c_i \vee a$. Its presence is needed to allow deriving $c_i$
from the revised theory.

If $\Pi$ is satisfiable, $a$ is irredundant. This is proved by showing a revising
formula that makes $a$ needed for the entailment of some formulae.
Namely, since $\Pi$ is satisfiable it has a model $\omega$. Let $T$ be defined as
follows:

$$T = \{c_i \mid \gamma_i \in \Pi\} \cup \{x_i \mid \omega(x_i) = \text{true}\} \cup \{\neg x_i \mid \omega(x_i) = \text{false}\}$$

In words, we set all $c_i$'s to true, and give to any $x_i$ the sign that is in
the model $\omega$. This formula is clearly satisfiable. Moreover, it is almost
complete, since the only variable that is not forced to have a specific
value is $a$. Moreover, $T$ implies all other clauses: $\neg c_i \vee a \vee \gamma_i$ is implied because
$\omega$ satisfies all clauses $\gamma_i$, while $c_i \vee a$ is entailed because $T$ contains $c_i$.
On the other hand, $a$ is neither falsified nor entailed. As a result, the
presence of $a$ in the result of revision is related to its presence in the
original theory.

If $a$ is irredundant, $\Pi$ is satisfiable. This is proved by using the
characterization of irredundancy provided by Lemma 10: since $a$ is
irredundant, there exist two models $\omega$ and $\omega'$ such that:

$$S_\Sigma(\omega) \setminus S_\Sigma(\omega') = \{a\}$$

Therefore, $\omega \models a$ and $\omega' \not\models a$. We can now proceed by using the
following rules:

1. every clause but $a$ that is satisfied by $\omega$ is also satisfied by $\omega'$
(otherwise $a$ would not be the only clause that is satisfied by $\omega$
but not by $\omega'$);

2. every clause that is not satisfied by $\omega'$ is not satisfied by $\omega$ as well
(same reason);

3. if a model satisfies some clauses, it also satisfies all their
consequences.

This leads to the pictorial proof of Figure 1.

Figure 1: If $a$ is irredundant then $\Pi$ is satisfiable.

In words, the proof proceeds as follows, using the rules above and the
fact that $\omega \models a$ and $\omega' \not\models a$. The latter is equivalent to $\omega' \models \neg a$. Since
$\omega \models a$ we have that $\omega \models c_i \vee a$. As a result, the same clause is satisfied
by $\omega'$, that is, $\omega' \models c_i \vee a$. Since $\omega' \models \neg a$, we can conclude that $\omega' \models c_i$
for all indexes $i$.

Since $\omega \models a$, we also have that $\omega \models \neg c_i \vee a \vee \gamma_i$. As a result, $\omega'$ satisfies
the same clause, that is, $\omega' \models \neg c_i \vee a \vee \gamma_i$. But we have already proved
that $\omega' \models \neg a$ and that $\omega' \models c_i$. As a result, we have that $\omega' \models \gamma_i$ for
all $i$. This proves that $\omega'$ is a model of all clauses $\gamma_i \in \Pi$. As a result,
$\Pi$ is satisfiable.

We can therefore conclude that all clauses of $\Sigma$ but $a$ are irredundant,
and that $a$ is redundant if and only if $\Pi$ is unsatisfiable. As a result, $\Sigma$ is
redundant if and only if $\Pi$ is unsatisfiable. □
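The construction of $\Sigma$ in this proof can also be exercised on small inputs. The sketch below is an illustration only: it builds $\Sigma$ from a list of clauses, using the hypothetical variable names `a`, `c0`, `c1`, ... (assumed not to occur in $\Pi$), and lists the clauses of $\Sigma$ that fail the condition of Lemma 10, that is, the conditionally redundant ones. It also assumes that no clause of $\Pi$ is empty.

```python
from itertools import product


def satisfied(clauses, model):
    """S(model): the clauses satisfied by `model`, a dict variable -> bool."""
    return frozenset(c for c in clauses if any(model[v] == s for v, s in c))


def theorem13_reduction(pi):
    """Sigma = {~c_i v a v g_i} u {c_i v a} u {a} for the clauses g_i of Pi.
    The names a, c0, c1, ... are assumed to be fresh with respect to Pi."""
    a = frozenset({("a", True)})
    sigma = []
    for i, g in enumerate(pi):
        c_i = f"c{i}"
        sigma.append(g | a | {(c_i, False)})  # ~c_i v a v g_i
        sigma.append(a | {(c_i, True)})       # c_i v a
    sigma.append(a)
    return sigma, a


def redundant_clauses(clauses, variables):
    """Clauses with no pair of models satisfying the condition of Lemma 10."""
    sets = [satisfied(clauses, dict(zip(variables, bits)))
            for bits in product((False, True), repeat=len(variables))]
    return [g for g in clauses
            if not any(s1 - s2 == {g} for s1 in sets if g in s1 for s2 in sets)]


if __name__ == "__main__":
    # Pi = {x, ~x} is unsatisfiable, so a is expected to be the only
    # redundant clause of Sigma.
    pi = [frozenset({("x", True)}), frozenset({("x", False)})]
    sigma, a = theorem13_reduction(pi)
    print(redundant_clauses(sigma, ["x", "a", "c0", "c1"]) == [a])
```

For a satisfiable $\Pi$ the same call is expected to return an empty list, in agreement with the statement that $\Sigma$ is redundant if and only if $\Pi$ is unsatisfiable.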


5 Conclusions 

We have presented a study of the semantical and computational properties of 
concepts related to the redundancy of CNF propositional formulae. Namely, 
we have considered the problem of checking whether a formula is redundant 
and some problems related to removing redundancy from it. The computational
analysis has shown that checking redundancy is coNP-complete. We
have then defined an I.E.S. as an irredundant equivalent subset of a formula, 
and studied some problems related to I.E.S.’s: checking, size, uniqueness, and 
membership of clauses to some or all I.E.S.’s. All problems have been given 
an exact characterization within the polynomial hierarchy, that is, we have 
found classes these problems are complete for. The problem of redundancy 
has also been studied for the case of two alternative forms of equivalence 
based on particular sets of possible queries. 

Some problems are still open. Namely, irredundancy is only one way of
defining a minimal representation of a formula, but other ones exist. In the
Horn case, several different definitions of minimality have been used, both
by Maier [Mai80] and by Ausiello et al. [ADS86], including irredundancy and
number of occurrences of literals. In the general (non-Horn) case, only the
number of occurrences of literals (and, in this paper, irredundancy) have been
considered. An open problem is whether the other notions of minimality used
in the Horn case make sense in the general case as well.

Some other problems have not been considered in this paper, and are
analyzed in two other papers. In the first one [Liba], the complexity of the
problem of redundancy has been analyzed for the case of Horn and 2CNF
formulae. The analysis of 2CNF, in particular, has shown a very interesting
pattern: while the properties of redundancy and irredundancy are different
depending on whether the formula implies some literals or not, a concept of
acyclicity often makes the difference between tractability and intractability.
In the other paper [Libb] some non-classical logics have been considered:
nonmonotonic logics, multi-valued logics, and logics for reasoning about actions.
An interesting issue of non-classical logics is that equivalence can be defined
in different ways, and that the irredundancy of all parts of a knowledge base
does not always imply the irredundancy of the knowledge base.